A Novel Protein Kinase Required for Ras Signal Transduction

The research carried out in the subject application was supported in part by grants from the 
National Institutes of Health. The government may have rights in any patent issuing on this 
application. 

INTRODUCTION 

Field of the Invention 

The field of the invention is a protein kinase required for Ras signal transduction and its use 
in pharmaceutical screens. 

Background 

Ras plays a crucial role in diverse cellular processes, such as proliferation and
differentiation, where it functions as a nodal point transmitting signals originating from receptor
tyrosine kinases (RTKs) to a variety of effector molecules (reviewed in McCormick, 1994a; van der
Geer et al., 1994; Burgering and Bos, 1995). Ras activation, which involves a switch from an
inactive GDP-bound to an active GTP-bound state, is promoted by a guanine nucleotide-exchange
factor. Upon RTK activation, the exchange factor is recruited by an SH2/SH3 domain-containing
adaptor molecule to the RTK at the plasma membrane where it can contact and activate Ras.
GTP-bound Ras then transmits the signal to downstream effector molecules.

The protein serine/threonine kinase Raf has been identified as a major effector of Ras
(reviewed in Daum et al., 1994; McCormick, 1994b). Upon Ras activation, Raf is recruited to the
plasma membrane by a direct interaction with Ras, where it is subsequently activated by an
unknown mechanism. Raf activation initiates an evolutionarily conserved pathway involving two
other kinases, MEK (MAPK Kinase) and MAPK (Mitogen-Activated Protein Kinase), that convey
signals to the nucleus through a directional series of activating phosphorylations (reviewed in
Marshall, 1994). Although this model for Ras-dependent signal transduction is well-supported,
there are still major issues that remain poorly understood. One of them is the mechanism by which
Raf is activated. Recent evidence suggests that once recruited to the plasma membrane Raf is
activated by phosphorylation (Dent and Sturgill, 1994; Dent et al., 1995). However, a candidate
kinase(s) has yet to be identified. Another unresolved issue is the nature of other Ras effectors as
well as the pathways they control. Although Raf is clearly a major Ras target, it cannot account
for all of the cellular responses mediated by Ras (for example see White et al., 1995).

Ectopic expression of an activated Ras1 allele, Ras1V12, in the developing Drosophila eye
transforms non-neuronal cone cells into R7 photoreceptor cells (Fortini et al., 1992). Similar results
are obtained by expression of an activated Drosophila Raf allele, D-RafTor4021 (Dickson et al., 1992).
We carried out a genetic screen designed to isolate mutations that modify the signaling efficiency
of Ras1V12. Most mutations that decreased the signaling efficiency of Ras1V12 also decreased the
efficiency of D-RafTor4021 signaling. However, two groups of mutations were identified that did
not alter D-RafTor4021 signaling. We disclose here the characterization of their respective loci. The
Suppressor of Ras1 2-2 (SR2-2) locus encodes a protein homologous to the catalytic subunit of the
prenylation enzyme type I geranylgeranyl transferase. We have renamed this locus βGGT-I. The
second locus, SR3-1, encodes a novel protein kinase distantly related to Raf kinase members.
Based on its sequence and the ability of mutants to reduce Ras1-mediated signaling, we renamed
this locus kinase suppressor of ras (ksr). In addition to its function in the Sevenless RTK pathway,
we show that ksr is also required for signaling by the Torso RTK. We have isolated mouse and
human homologs of ksr. Together, these data indicate that Ksr is an evolutionarily conserved
component of the Ras signaling pathway. As such, the human Ksr provides an important target for
pharmaceutical intervention.

Relevant Literature 

Recent reports on Raf activation include Dent and Sturgill, 1994; Dent et al., 1995; White
et al., 1995; Yao et al., 1995; and a recent review by Marshall, 1994.

SUMMARY OF THE INVENTION 
The invention provides methods and compositions relating to a novel protein kinase 
involved in the regulation of cell growth and differentiation: kinase suppressor of Ras (Ksr). As 
such, the kinase provides an important target for therapeutic intervention. The subject compositions 
also include nucleic acids which encode a Ksr kinase, and hybridization probes and primers capable 
of hybridizing with a Ksr gene. Such probes are used to identify mutant Ksr alleles associated with 
disease. 

The invention includes methods for screening chemical libraries for lead compounds for a
pharmacological agent useful in the diagnosis or treatment of disease associated with Ksr activity or
Ksr-dependent signal transduction. In one embodiment, the methods involve (1) forming a mixture
comprising a Ksr, a natural intracellular Ksr substrate or binding target such as the 14-3-3 gene
product, and a candidate pharmacological agent; (2) incubating the mixture under conditions
whereby, but for the presence of said candidate pharmacological agent, said Ksr selectively
phosphorylates said substrate or binds said binding target at a control rate; and (3) detecting the
presence or absence of a change in the specific phosphorylation of said substrate by said Ksr or
phosphorylation or binding of said Ksr to said binding target, wherein such a change indicates that
said candidate pharmacological agent is a lead compound for a pharmacological agent capable of
modulating Ksr function.

DETAILED DESCRIPTION OF THE INVENTION 
A Drosophila melanogaster, a Drosophila virilis, a murine and a human ksr encoding
sequence are set out in SEQ ID NO: 1, 3, 5 and 7, respectively. A Drosophila melanogaster, a
Drosophila virilis, a murine and a human ksr protein sequence are set out in SEQ ID NO: 2, 4, 6 and
8, respectively. Ksr proteins necessarily include a disclosed ksr kinase domain. Hence, Ksr
proteins include deletion mutants of natural ksr proteins retaining the ksr kinase domain.

Natural nucleic acids encoding ksr proteins are readily isolated from cDNA libraries with
PCR primers and hybridization probes containing portions of the nucleic acid sequence of SEQ ID
NO: 1, 3, 5 and 7. Preferred ksr nucleic acids are capable of hybridizing with one of these sequences
under low stringency conditions defined by a hybridization buffer consisting essentially of 1%
Bovine Serum Albumin (BSA); 500 mM sodium phosphate (NaPO4); 1 mM EDTA; 7% SDS at a
temperature of 42°C and a wash buffer consisting essentially of 2X SSC (600 mM NaCl; 60 mM
Na Citrate); 0.1% SDS at 50°C; more preferably under low stringency conditions defined by a
hybridization buffer consisting essentially of 1% Bovine Serum Albumin (BSA); 500 mM sodium
phosphate (NaPO4); 15% formamide; 1 mM EDTA; 1% SDS at a temperature of 50°C and a wash
buffer consisting essentially of 1X SSC (300 mM NaCl; 30 mM Na Citrate); 0.1% SDS at 50°C;
most preferably under low stringency conditions defined by a hybridization buffer consisting
essentially of 1% Bovine Serum Albumin (BSA); 200 mM sodium phosphate (NaPO4); 15%
formamide; 1 mM EDTA; 7% SDS at a temperature of 50°C and a wash buffer consisting
essentially of 0.5X SSC (150 mM NaCl; 15 mM Na Citrate); 0.1% SDS at 65°C.

The subject nucleic acids are recombinant, meaning they comprise a sequence joined to a
nucleotide other than that to which the sequence is naturally joined, and are isolated from a natural
environment. The nucleic acids may be part of Ksr-expression vectors and may be incorporated
into cells for expression and screening, transgenic animals for functional studies (e.g. the efficacy
of candidate drugs for disease associated with expression of a Ksr), etc. These nucleic acids find
a wide variety of applications including use as templates for transcription, hybridization probes,
PCR primers, therapeutic nucleic acids, etc.; use in detecting the presence of Ksr genes and gene
transcripts, in detecting or amplifying nucleic acids encoding additional Ksr homologs and
structural analogs, and in gene therapy applications, e.g. using antisense nucleic acids or ribozymes
comprising the disclosed Ksr sequences or their complements or reverse complements.

The invention also provides Ksr-specific binding reagents such as antibodies. Such reagents
find a wide variety of applications in biomedical research and diagnostics. For example, antibodies
specific for mutant Ksr allele-products are used to identify mutant phenotypes associated with
pathogenesis. Methods for making allele-specific antibodies are known in the art. For example,
an mKsr-specific antibody was generated by immunizing mice with a unique N-terminal mKsr
peptide (residues 118-249) GST fusion.

The invention provides efficient methods of identifying pharmacological agents or lead
compounds for agents active at the level of a Ksr-modulatable cellular function, particularly
Ksr-mediated signal transduction. For example, we have found that a binding complex comprising Ksr,
14-3-3 and Raf exists in stimulated cells; modulators of the stability of this complex affect signal
transduction. Generally, the screening methods involve assaying for compounds which interfere
with a Ksr activity such as kinase activity or target binding. The methods are amenable to
automated, cost-effective high throughput screening of chemical libraries for lead compounds.
Identified reagents find use in the pharmaceutical industries for animal and human trials; for
example, the reagents may be derivatized and rescreened in in vitro and in vivo assays to optimize
activity and minimize toxicity for pharmaceutical development. Target therapeutic indications are
limited only in that the target cellular function be subject to modulation, usually inhibition, by
disruption of the formation of a complex comprising Ksr and one or more natural Ksr intracellular
binding targets including substrates or otherwise modulating Ksr kinase activity. Target indications
may include infection, genetic disease, cell growth and regulatory or immunologic dysfunction, such
as neoplasia, inflammation, hypersensitivity, etc.

A wide variety of assays for binding agents are provided including labeled in vitro kinase 
assays, protein-protein binding assays, immunoassays, cell based assays, etc. The Ksr compositions 
used in the methods are recombinantly produced from nucleic acids having the disclosed Ksr 
nucleotide sequences. The Ksr may be part of a fusion product with another peptide or polypeptide, 
e.g. a polypeptide that is capable of providing or enhancing protein-protein binding, stability under 
assay conditions (e.g. a tag for detection or anchoring), etc. 

The assay mixtures comprise one or more natural intracellular Ksr binding targets including
substrates, such as the 14-3-3 gene product, or, in the case of an autophosphorylation assay, the Ksr
itself can function as the binding target. A Ksr-derived pseudosubstrate may be used or modified
(e.g. A to S/T substitutions) to generate effective substrates for use in the subject kinase assays as
can synthetic peptides or other protein substrates. Generally, Ksr-specificity of the binding agent
is shown by kinase activity (i.e. the agent demonstrates activity of a Ksr substrate, agonist,
antagonist, etc.) or binding equilibrium constants (usually at least about 10^6 M^-1, preferably at least
about 10^8 M^-1, more preferably at least about 10^9 M^-1). A wide variety of cell-based and cell-free
assays may be used to demonstrate Ksr-specific binding; preferred are rapid in vitro, cell-free assays
such as mediating or inhibiting Ksr-protein binding, phosphorylation assays, immunoassays, etc.
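
The binding thresholds in the preceding paragraph are association constants, read here as 10^6, 10^8 and 10^9 M^-1. As a minimal sketch, a measured dissociation constant can be converted and compared against those tiers. The function names and tier labels below are illustrative assumptions and do not come from the disclosure.

```python
# Illustrative sketch: names and labels are hypothetical, not terms of the disclosure.
TIERS = [  # (minimum association constant in M^-1, label), strictest first
    (1e9, "more preferred"),
    (1e8, "preferred"),
    (1e6, "usual"),
]


def association_constant(kd_molar: float) -> float:
    """Return Ka (M^-1) for a dissociation constant Kd given in molar units."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return 1.0 / kd_molar


def affinity_tier(ka: float) -> str:
    """Return the strictest tier whose minimum Ka is met, or 'below threshold'."""
    for minimum, label in TIERS:
        if ka >= minimum:
            return label
    return "below threshold"
```

For example, a hypothetical agent with a Kd of 5 nM has a Ka of 2 x 10^8 M^-1 and would fall in the "preferred" tier under this reading.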

The assay mixture also comprises a candidate pharmacological agent. Candidate agents
encompass numerous chemical classes, though typically they are organic compounds; preferably
small organic compounds and are obtained from a wide variety of sources including libraries of
synthetic or natural compounds. A variety of other reagents may also be included in the mixture.
These include reagents like salts, buffers, neutral proteins, e.g. albumin, detergents, etc. which may
be used to facilitate optimal binding and/or reduce non-specific or background interactions, etc.
Also, reagents that otherwise improve the efficiency of the assay, such as protease inhibitors,
nuclease inhibitors, antimicrobial agents, etc. may be used.

In a preferred in vitro binding assay, a mixture of a protein comprising at least one of the
conserved Ksr domains, including CA1, CA2, CA3, CA4 and the kinase domain (see Table 1), one
or more binding targets or substrates and the candidate agent is incubated under conditions whereby,
but for the presence of the candidate pharmacological agent, the Ksr specifically binds the cellular
binding target at a first binding affinity or phosphorylates the substrate at a first rate. After
incubation, a second binding affinity or rate is detected. Detection may be effected in any
convenient way. For cell-free binding assays, one of the components usually comprises or is
coupled to a label. The label may provide for direct detection as radioactivity, luminescence,
optical or electron density, etc. or indirect detection such as an epitope tag, an enzyme, etc. A
variety of methods may be used to detect the label depending on the nature of the label and other
assay components. For example, the label may be detected bound to the solid substrate or a portion
of the bound complex containing the label may be separated from the solid substrate, and thereafter
the label detected.
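
The comparison of a first (control) rate with a second rate measured in the presence of a candidate agent can be sketched as follows. This is only an illustration: the function names, the example readings and the 50% cut-off are assumptions, since the disclosure does not specify a threshold.

```python
def percent_change(control_rate: float, test_rate: float) -> float:
    """Signed change in rate relative to the no-agent control, in percent."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (test_rate - control_rate) / control_rate


def is_lead(control_rate: float, test_rate: float, cutoff_percent: float = 50.0) -> bool:
    """Flag an agent whose rate differs from control by at least the cutoff, in either direction."""
    return abs(percent_change(control_rate, test_rate)) >= cutoff_percent


# Hypothetical readings (arbitrary units per minute), not data from the disclosure.
control = 100.0
readings = {"agent_A": 12.0, "agent_B": 95.0, "agent_C": 210.0}
hits = sorted(name for name, rate in readings.items() if is_lead(control, rate))
print(hits)  # ['agent_A', 'agent_C']
```

Using the absolute change flags both inhibitors and enhancers, which matches the text's broad use of "modulating"; a screen aimed only at inhibitors would test for a negative change instead.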

The following experiments and examples are offered by way of illustration and not by way
of limitation.

EXPERIMENTAL 

Mutations in the SR2-2 and SR3-1 loci suppress the eye phenotype of activated Ras1 but not
that of activated D-Raf.

Ectopic expression of activated Ras1 (Ras1V12) under control of sevenless
promoter/enhancer sequences (sev-Ras1V12) transforms cone cells into R7 photoreceptor cells
(Fortini et al., 1992). These extra R7 cells disorganize the ommatidial array, which causes a
roughening of the external eye surface. The severity of eye roughness appears proportional to the
strength of Ras1-mediated signaling since two copies of the transgene produce a much more
disrupted eye than one copy. We took advantage of this sensitized system to conduct a screen for
mutations that reduce (suppressors) or increase (enhancers) the degree of eye roughness. We
reasoned that a two-fold reduction in the dose of a gene (by mutating one of its two copies) that
functions downstream of Ras1 should dominantly alter signaling strength, which in turn should
visibly modify the roughness of the eye. Based on this assumption, we screened ~200,000 EMS-
and ~650,000 X-ray-mutagenized progeny for dominant modifiers of the Ras1V12-mediated rough
eye phenotype. 18 complementation groups of suppressors with multiple alleles and 13
complementation groups of enhancers of sev-Ras1V12 were isolated.

To characterize further the various groups of suppressors, we tested their ability to suppress
dominantly the extra R7 cell phenotype caused by overexpression of an activated Drosophila Raf
allele. Since Raf functions directly downstream of Ras, we expected most of our
suppressor groups to modify sE-RafTor4021 signaling. However, two
suppressor groups, SR2-2 and SR3-1, did not reduce the number of extra R7 cells produced by
D-RafTor4021 expression. Scanning electron micrographs of adult eyes illustrate the suppressor
phenotypes of one SR3-1 allele. Similar results were obtained with multiple SR2-2 and SR3-1
alleles. We also monitored the suppression of extra R7 cells by counting the number of R7
photoreceptors in cross-sections of adult fly retinae. In wild-type there is one R7 cell per
ommatidium, whereas in sev-Ras1V12/+ flies we observed 2.3 (n=437) R7 cells per ommatidium.
This number was reduced to 1.2 (n=481) R7 cells per ommatidium in sev-Ras1V12/+; SR3-1/+
flies. In sE-RafTor4021/+ flies, 2.3 (n=302) R7 cells per ommatidium were observed. However, this
number remained at 2.3 (n=474) in sE-RafTor4021/+; SR3-1/+ flies, reflecting the inability of
SR3-1 mutations to alter sE-RafTor4021 signaling strength.

Targeting of Ras1V12 to the plasma membrane by myristylation distinguishes SR2-2 from SR3-1

Prenylation of the C-terminal CAAX box (C=cysteine, A=aliphatic residue, X=any amino
acid) is the major post-translational modification specific to all Ras-like GTPases. When the
residue at position "X" is a leucine, as in Ras1, a geranylgeranyl group is added by a type I
geranylgeranyl transferase. The addition of this lipidic moiety is required to attach Ras to the
plasma membrane (reviewed in Glomset and Farnsworth, 1994). Deletion of the CAAX box
abolishes Ras function (Willumsen et al., 1984; Kato et al., 1992); however, its activity can be
restored if it is brought to the membrane by another localization signal, such as a myristyl group
(Buss et al., 1989).

One possibility to account for the ability of a mutant to suppress sev-Ras1V12 but not
sE-RafTor4021 is that the locus encodes an enzyme that is required for the membrane localization of Ras1.
Consequently, mutations in this locus would not affect D-RafTor4021. To directly test this possibility,
we asked if SR2-2 or SR3-1 alleles could suppress activated Ras1 if it is targeted to the membrane
by an alternative mechanism. We targeted Ras1V12 to the membrane by fusing the first 90 amino
acids of Drosophila Src kinase (D-Src; Simon et al., 1985), which contains a myristylation signal,
to Ras1V12 deleted of its CAAX box (sev-Src90Ras1V12ΔCAAX). While the CAAX box-deleted Ras1V12
is inactive, Src90Ras1V12ΔCAAX produces the same phenotype as Ras1V12; that is, it generates extra
R7 cells and disrupts the ommatidial array.

We crossed sev-Src90Ras1V12ΔCAAX flies to SR2-2 and SR3-1 alleles and analyzed the rough
eye phenotype. The SR2-2 allele tested did not suppress the rough eye phenotype, while the SR3-1
allele tested suppressed the rough eye phenotype and the production of extra R7 cells. These
observations indicate that SR2-2 is involved in prenylation of Ras1 while SR3-1 encodes a
component of the Ras1 pathway that is not involved in the process of Ras1 membrane localization.

The SR2-2 locus encodes the Drosophila homolog of the β-subunit of type I geranylgeranyl
transferase.

The SR2-2 locus was meiotically mapped to 2-15 (cytological position 25B-C), based on the
ability of different mutant alleles to suppress sev-Ras1V12. One of the seven recessive lethal SR2-2
alleles recovered contains an X-ray-induced inversion (SR2-2S-2126) with a breakpoint at 25B4-6.
Genomic DNA spanning this breakpoint was isolated and used to screen a Drosophila eye-antennal
imaginal disc cDNA library (see Experimental Procedures). A single class of cDNAs (ranging in
size from 0.8 to 1.6 kb) defining a transcription unit disrupted by the inversion present in
SR2-2S-2126 was identified and characterized. Conceptual translation of the longest open reading frame
(ORF) defined by these cDNAs predicts a protein of 395 amino acids. Determination of the gene
structure by sequencing the corresponding genomic region revealed four exons with the first
in-frame methionine located at the beginning of the second exon. The SR2-2S-2126 inversion breakpoint
maps to the 5' end of the transcript. Further confirmation that this ORF corresponds to the SR2-2
gene was provided by sequence analysis of two other mutant SR2-2 alleles, both
of which have small deletions that remove the first exon and part of the 5' regulatory sequences.
A search of the current protein databases with this ORF indicated that the SR2-2 gene encodes the
Drosophila homolog of the catalytic β-subunit of type I geranylgeranyl transferase (βGGT-I)
(Marshall, 1993). Sequence alignment with the human and the yeast S. pombe βGGT-I proteins
shows a high degree of evolutionary conservation. The human sequence is 44% identical (69%
similar) to the Drosophila sequence throughout the entire ORF while the yeast sequence is 36%
identical (57% similar) to the Drosophila protein. We therefore renamed this locus βGGT-I.

The SR3-1 locus encodes a novel protein kinase.

The ability of SR3-1 mutant alleles to suppress the sev-Ras1V12 phenotype was meiotically
mapped to 3-47.5, which corresponds to a region near the chromocenter of the third chromosome.
The map position was further refined by showing that SR3-1 meiotically maps between two
P-elements inserted at 82F8-10 and 83A5-6, respectively. X-ray-induced chromosomal deletions
were generated by selecting w revertants of one of the P-element insertions. One such deletion,
Df(3R)e1025-14, which removes the chromosomal region from 82F8-10 to 83A1-3, complemented
the SR3-1-associated lethality. Taken together, these results indicated that the SR3-1 locus lies
between 83A1-3, the distal breakpoint of Df(3R)e1025-14, and 83A5-6, the insertion site of
P[w+]5E2.

Five overlapping cosmids which cover this chromosomal region were recovered by
chromosome walking. To identify restriction site polymorphisms that might have been induced in
the SR3-1 alleles, these cosmids were used to probe genomic DNA blots prepared from 9
independent X-ray-induced SR3-1 alleles. Cosmid III revealed polymorphisms in a BamHI
restriction digest of two of these alleles. No other cosmid revealed polymorphisms
in the 9 tested alleles. A 7 kb SacII genomic fragment which spans the polymorphic BamHI
fragments was introduced into the germline by P-element-mediated transformation. This genomic
fragment, tested in transgenic flies, rescued both the lethality and the sev-Ras1V12-suppression ability
of three independent SR3-1 alleles. A single class of cDNAs that was totally encoded by the 7 kb
genomic fragment was identified by screening a Drosophila eye-antennal imaginal disc cDNA
library and sequenced. The longest cDNA clone represents a transcript of 3.6 kb, which is close
to the size of a full-length transcript since RNA blot analysis identified a single band of similar size.
Sequence analysis of the genomic region revealed that this transcript is encoded by a single exon.
Conceptual translation of the longest ORF predicts a protein of 966 amino acids. The presence of
an in-frame stop codon upstream of the predicted initiating methionine indicates that this cDNA
contains the complete ORF.
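
The reasoning in the last two sentences (take the longest ORF, then look for an in-frame stop codon 5' of its initiating ATG) can be sketched in a few lines. This is an illustrative simplification that scans only the forward strand; the function names and toy sequences are assumptions, not the procedure or sequences used in the disclosure.

```python
STOPS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand, or None.

    start is the index of the A of the ATG; end is the index just past the stop codon.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG since the previous stop in this frame
            elif start is not None and codon in STOPS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def has_upstream_inframe_stop(seq: str, start: int) -> bool:
    """True if a stop codon lies 5' of position `start` in the same reading frame."""
    seq = seq.upper()
    return any(seq[i:i + 3] in STOPS for i in range(start % 3, start, 3))


toy = "TAAGCCATGGCTGCATGACC"  # made-up sequence: TAA GCC ATG GCT GCA TGA CC
orf = longest_orf(toy)
print(orf, has_upstream_inframe_stop(toy, orf[0]))  # (6, 18) True
```

An upstream in-frame stop means the reading frame cannot extend further 5' within the clone, which is why it supports the conclusion that the ORF is complete.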




SUBSTITUTE SHEET (RULE 26) 



WO 97/21820 



PCT/US96/19941 



A search of current protein databases indicated that SR3-1 encodes a novel protein kinase.
The putative catalytic domain, which is C-terminal, contains the characteristic eleven conserved
sub-domains found in eukaryotic kinases (Hardie and Hanks, 1995) and is preceded by a long
N-terminal region with three distinctive features: a cysteine-rich domain similar to those found in
Protein Kinase C isozymes (Hubbard et al., 1991) and Raf kinases (Bruder et al., 1992); four
sequences that match the consensus phosphorylation site (PXS/TP) for MAPK (Marshall, 1994);
and a block of amino acids rich in serines and threonines followed by a conserved motif
(FXFPXXS/T) that resembles the sequence around the Conserved Region 2 (CR2) domain of Raf
kinases (Heidecker et al., 1992). Since the SR3-1 locus encodes a putative protein kinase and
mutant alleles were isolated as suppressors of sev-Ras1V12, we renamed this locus kinase suppressor
of ras (ksr).

Further confirmation that this gene corresponds to the ksr (SR3-1) locus was provided by 
sequencing three ksr alleles which revealed mutations disrupting the Ksr ORF (Table 1). 

Table 1 provides a detailed comparison of the predicted amino acid sequences of Ksr kinases.
Conceptual translation of the open reading frame from the longest D. melanogaster (Dm) Ksr cDNA
is shown. The positions of mutations in three ksr alleles are indicated: S-548 is a 4 bp
X-ray-induced mutation affecting two consecutive codons (CTG-CGA to AGT-GGA). S-638 is an
EMS-induced allele that has two separate point mutations changing a GCC codon to GTC and a GCG
codon to ACG. S-721 is a frameshift mutation due to a 10 bp duplication from adjacent sequences
within the codon for asparagine-727. Also shown in the alignment are the conceptual translations
of the open reading frames for the Ksr genes from other species: the D. virilis (Dv) Ksr sequence
was derived from genomic DNA, the mouse (m) Ksr-1 from a 4 kb cDNA, and the human (h)
Ksr-1 was deduced from three overlapping cDNA clones (the N-terminal two residues were absent from
these clones so the numbering begins with the third residue). The human Ksr is present as one or
more of a plurality of alternatively spliced forms, exemplified by Ksr' in the following sequence
listing. The amino acid sequences (and their respective positions) for the cysteine-rich regions and
the kinase domains of Drosophila (D-Raf) and human (h c-Raf) (GenBank accession numbers
X07181 and X03484, respectively) are presented. Residues identical to Dm Ksr are lower case.
In the N-terminus of the Ksr kinases four Conserved Areas (CA1 to CA4) are boxed. CA1 is a
novel domain present only in the Ksr kinases. CA2 is a proline-rich stretch that may represent an
SH3-binding site (Alexandropoulos et al., 1995). CA3 is a cysteine-rich stretch, similar to a domain
found in multiple signaling molecules. This conserved sequence is also part of the CR1 domain
found in Raf kinases (Bruder et al., 1992). CA4 is a long serine/threonine-rich stretch followed by
a conserved motif (indicated by a dashed line). This domain resembles the region around the CR2
domain of Raf kinases (Heidecker et al., 1992). The four short thick lines overlying the sequences
indicate potential sites of phosphorylation by MAPK (PXS/TP) found in Dm Ksr. The eleven
conserved sub-domains characteristic of protein kinases are indicated by roman numerals below
their approximate positions.

ksrS-638 has two single amino acid changes: alanine-696 to valine and alanine-703 to
threonine. The latter substitution alters a highly conserved residue within kinase sub-domain II
(Hanks et al., 1988). ksrS-721 contains a 10 bp insertion in the codon for asparagine-727 within
kinase sub-domain III, creating a frameshift mutation that truncates the protein at kinase sub-domain
III. ksrS-548 has a four base pair substitution that changes two consecutive amino acids in the
N-terminus of the protein: leucine-50 and arginine-51 to glycine and serine, respectively. Unlike the
16 alleles recovered in the screen which were recessive lethal, ksrS-548 produces sub-viable flies
which have rough eyes (see below), indicating that it is a weak loss-of-function mutation.
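
The two S-638 codon changes given in the Table 1 legend (GCC to GTC and GCG to ACG) can be checked against the standard genetic code. The helper below is an illustrative sketch; its names are hypothetical and it is not part of the disclosed analysis.

```python
BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_change(before: str, after: str) -> str:
    """Describe a codon substitution as a one-letter amino acid change, e.g. 'A>V'."""
    return f"{CODON_TABLE[before.upper()]}>{CODON_TABLE[after.upper()]}"


print(codon_change("GCC", "GTC"))  # A>V (alanine to valine)
print(codon_change("GCG", "ACG"))  # A>T (alanine to threonine)
```

Both results agree with the alanine-696 to valine and alanine-703 to threonine changes described for this allele.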

Identification of Ksr homologs in other species defines a novel class of kinases related to
Raf kinases.

As a first attempt to determine functionally important domains that comprise the Ksr kinase,
we searched for homologs from other species. First, we isolated the complete coding region of ksr
from a Drosophila virilis genomic library by low-stringency hybridization (see Experimental
Procedures). The D. virilis genomic sequence revealed a single uninterrupted ORF predicting a
protein of 1003 amino acids (Table 1). The D. virilis and D. melanogaster Ksr proteins are 96%
identical within the kinase domain while the N-terminal region is more divergent (69% identity),
although islands of high conservation are present (see Table 1).
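
Percent identity figures such as those above are computed from a pairwise alignment. A minimal sketch follows, assuming identity is counted over aligned columns in which neither sequence has a gap. The disclosure does not state which denominator was used, and alignment programs differ on this point, so the function and the toy strings below are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


# Made-up eight-residue strings differing at one position.
print(percent_identity("HKDLKSKN", "HRDLKSKN"))  # 87.5
```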

A search of translated nucleotide databases (using the TBLASTN program; Altschul et al.,
1990) identified a partial ORF derived from a mouse DNA sequence with significant blocks of
similarity to the N-terminus of Ksr. This sequence, named hb, had been isolated by Nehls et al.
(1994) as part of an exon-trapping strategy to establish the transcription map of a 1 Mb region
around the mouse NF1 locus. To determine if the full-length hb transcript also contains a kinase
domain related to Ksr, we screened a cDNA library derived from a mouse PCC4 teratocarcinoma
cell line with a probe corresponding to the hb sequence (see Experimental Procedures). A 4 kb
cDNA clone was isolated and encodes a protein of 873 amino acids that contains a kinase domain
highly related to the Ksr kinase domain (51% identity/74% similarity; Table 1). In addition, a
human fetal brain cDNA library was screened at low-stringency with the same hb probe (see
Experimental Procedures). Thirteen independent cDNA clones were purified and sequenced. They
represent partial transcripts ranging in size from 0.6 to 3 kb. Interestingly, they define at least three
classes of N-terminal splicing variants. The predicted protein sequence derived from overlapping
human cDNA clones is shown in Table 1. With the exception of the first divergent 23 amino acids,
which probably represent an alternative exon, human Ksr-1 (hKsr-1) is nearly identical to mouse
Ksr-1 (mKsr-1; 95% identity/99% similarity). Subsequent to this analysis, two human Expressed
Sequence Tags (GenBank accession numbers: R27352 and R27353) have been reported that
correspond to regions of the hKsr kinase domain.

Comparison of mammalian and Drosophila Ksr sequences showed similarity throughout the
kinase domain as well as at various locations within the N-terminal region (Table 1). Sequence
conservation is obvious within all sub-domains of the kinase domain. Two interesting features are
present within sub-domains VIb and VIII. HRDL(K/R/A)XXN (D and N are invariant residues)
is the consensus sequence corresponding to sub-domain VIb for the majority of known kinases
(Hardie and Hanks, 1995). Instead of an arginine at the second position, a lysine is present for the
Ksr homologs, which distinguishes them from most other kinases. In addition, the amino acids
N-terminal to the APE motif in sub-domain VIII, which have been implicated in substrate recognition
specificity (Hardie and Hanks, 1995), are well-conserved between the Ksr kinases of different
species, but differ from those of all other kinases. One peculiarity is found in sub-domain II of the
two mammalian proteins. This sub-domain has an invariant lysine residue involved in the
phosphotransfer reaction that is conserved in all kinases identified thus far (Hardie and Hanks, 1995);
however, both mammalian sequences have an arginine at this position (Table 1). It has been shown
that mutagenesis of this lysine residue to any other residue, including arginine, abolishes catalytic
function in several kinases (Hanks et al., 1988). However, the sequence conservation between the
mouse and the human kinase domains indicates that these enzymes are functional.

Sub-domains VIb and VIII also contain conserved residues that often correlate with hydroxy
amino acid recognition (Hanks et al., 1988). For instance, HRDLKXXN (VIb) and T/SXXY/F
(VIII) motifs are indicative of Ser/Thr-kinases while HRDLR/AXA/RN (VIb) and PXXW (VIII)
motifs are associated with Tyr-kinases. Based solely on these conserved residues it is not clear to
which class Ksr kinases belong (Table 1). Indeed, for sub-domain VIb, the Drosophila sequences
have an arginine residue at the critical position (like a Tyr-kinase), while the two mammalian
sequences have a lysine residue (like a Ser/Thr-kinase). The sub-domain VIII motif for all the Ksr
members is WXXY, which differs from that found in all other kinases.
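
The sub-domain VIb comparison described above can be sketched as a simple pattern check. The Python snippet below is an illustration only: the function name, the regular expressions (written from the consensus motifs quoted above, with X taken as any residue) and the fallback that classifies on the fifth motif position alone are assumptions made for this sketch and are not part of the disclosed analysis. The motif HKDLRTKN used in the example is read from residues 798-805 of SEQ ID NO: 2.

```python
import re

# Consensus motifs for kinase sub-domain VIb as quoted in the text (X = any residue).
SER_THR_VIB = re.compile(r"HRDLK..N")      # HRDLKXXN
TYR_VIB = re.compile(r"HRDL[RA].[AR]N")    # HRDL(R/A)X(A/R)N


def classify_vib(motif: str) -> str:
    """Return a coarse class hint for an 8-residue sub-domain VIb motif."""
    if len(motif) != 8:
        raise ValueError("expected an 8-residue motif")
    if SER_THR_VIB.fullmatch(motif):
        return "Ser/Thr consensus"
    if TYR_VIB.fullmatch(motif):
        return "Tyr consensus"
    # Assumed fallback: look only at the fifth residue, the position the text
    # calls critical (K resembles Ser/Thr-kinases, R or A resembles Tyr-kinases).
    critical = motif[4]
    if critical == "K":
        return "Ser/Thr-like at critical position"
    if critical in "RA":
        return "Tyr-like at critical position"
    return "unclassified"


# Dm Ksr motif (residues 798-805 of SEQ ID NO: 2); it carries a lysine at the
# second position, so neither full consensus matches.
print(classify_vib("HKDLRTKN"))
```

For the Dm Ksr motif the sketch reports a Tyr-like residue at the critical position, in line with the observation above that the Drosophila sequences carry an arginine there.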

In the N-terminal region, four Conserved Areas (CA1 to CA4) can be recognized (Table 1).
CA1 is a stretch of 40 amino acids located at the very N-terminus of Ksr kinases and has no
equivalent in the database. Its conservation and the identification of a mutation in it (ksr^S-548)
indicate that it plays a role in Ksr function. CA2 is a proline-rich stretch followed by basic residues
which may correspond to a class II SH3-domain binding site (PXXPXR/K; Alexandropoulos et al.,
1995), although the two fly sequences diverge from the consensus by one amino acid. CA3 is a
cysteine-rich domain similar to the one found in other signaling molecules, such as the CR1 domain
of Raf. Finally, CA4 is rich in serines and threonines and also contains a MAPK consensus
phosphorylation site.
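
As an illustration of how a class II SH3-binding consensus such as PXXPXR/K can be located in a protein sequence, the following Python sketch scans a one-letter sequence for the motif. The function name and the example peptide are hypothetical and serve only to exercise the pattern; they are not taken from the Ksr sequences.

```python
import re

# Class II SH3-binding consensus quoted in the text: PXXPX(R/K), X = any residue.
# A lookahead is used so that overlapping occurrences are all reported.
SH3_CLASS_II = re.compile(r"(?=(P..P.[RK]))")


def find_sh3_class2_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched hexapeptide) for each site."""
    return [(m.start() + 1, m.group(1))
            for m in SH3_CLASS_II.finditer(sequence.upper())]


# Hypothetical peptide, used only to exercise the function.
example = "GSPTLPPRKG"
print(find_sh3_class2_sites(example))
```

A sequence that diverges from the consensus by one residue, as noted for the two fly proteins, would not be reported by this strict pattern; a relaxed pattern would be needed to detect such near matches.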

A search of current databases indicated that the Raf kinase members are the closest relatives
to the Ksr kinases based on sequence similarity within the kinase domain (e.g. 42% identity/61%
similarity between the Dm Ksr and Raf kinase domains) and shared structural features in the
N-terminal region (Table 1). Both the Raf and Ksr kinases have a related C-terminal 300 amino acid
kinase domain, named CR3 and CA5, respectively (CR3; Heidecker et al., 1992). The spacing and
sizes of the domains of the Ksr kinases are well conserved, except for the presence of an additional
~100 amino acids between the CA4 and CA5 domains of the Drosophila sequences. In addition,
both kinase families have a long N-terminal region that contains a cysteine-rich stretch followed by a
serine/threonine-rich region, named CA3 and CA4 for Ksr kinases and CR1 and CR2 for Raf
kinases. Ksr and Raf kinases also have distinctive features. For instance, the CA1 and CA2 regions
found in Ksr kinases are absent from Raf kinases. The Ras-binding domain (RBD) found in the
CR1 domain of Raf kinases (Nassar et al., 1995) is absent from Ksr kinases, which suggests that
they are regulated differently. Moreover, interaction assays using the yeast two-hybrid system or
bacterially-expressed fusion proteins did not detect any interaction between Ras1 and Ksr, while
similar experiments detected an interaction between Ras1 and the CR1 domain of D-Raf. Finally,
amino acids in kinase sub-domain VIII, which are important for substrate recognition, are not
conserved between Ksr and Raf kinases, suggesting that these kinases have different targets. This
is supported by the observation that Ksr failed to interact with Dsor1 (D-MEK) in a yeast
two-hybrid assay, whereas D-Raf and Dsor1 interacted strongly.

Ksr functions in multiple RTK pathways.

Recent evidence suggests that RTKs use a similar set of proteins to transduce their signals
to the nucleus (see Background). Several lines of genetic evidence suggest that the Ksr kinase
corresponds to a new component of this widely used signal transduction pathway. For instance,
adult flies homozygous for the sub-viable allele ksr^S-548 have rough eyes in which ommatidia are
missing both outer (R1-R6) and R7 photoreceptor cells. This suggests that, like Ras1 (Simon et al.,
1991), ksr has a broader role than just specification of the R7 cell fate. Using the FLP/FRT system
(Xu and Rubin, 1993), we did not recover homozygous mutant tissue for the strong allele ksr^S-638,
which indicates that Ksr is required for cell proliferation or survival. In addition, except for the
ksr^S-548 allele, all ksr alleles are recessive lethal; in most cases the mutants die as third instar
larvae and lack imaginal discs. This phenotype is often seen with mutations in genes required for cell
proliferation (Gatti and Baker, 1989). RNA in situ hybridization showed that ksr mRNA is
ubiquitously distributed and is present throughout embryogenesis, consistent with a general role for
this kinase.

We directly tested whether ksr is an essential component of the Torso RTK pathway, another
Drosophila RTK-dependent signal transduction cascade (reviewed in Duffy and Perrimon, 1994).
Torso initiates a signal transduction cascade required for development of the anterior and posterior
extremities of the embryo. As for the Sevenless RTK pathway, genetic screens aimed at elucidating
this pathway have led to the identification of drk, sos, Ras1 and genes encoding the downstream
cassette of kinases (Raf/MEK/MAPK) as being critical for signal propagation (reviewed in Duffy
and Perrimon, 1994). This signal transduction cascade appears to control the expression pattern of
two genes, tailless (tll) and huckebein (hkb), at the embryonic termini (reviewed in Duffy and
Perrimon, 1994). During the cellular blastoderm stage, the posterior domain of expression of both
factors depends uniquely on Torso-mediated signaling, thereby providing excellent markers of Torso
activity.

Embryos derived from mothers homozygous for a torso null mutation have defective
termini. The posterior end is missing all structures beyond the seventh abdominal segment, while
the anterior end exhibits severe head skeleton defects (reviewed in Duffy and Perrimon, 1994).
Consistent with these abnormalities, aberrant expression patterns are observed for tll and hkb; that
is, no tll or hkb expression is detected at the posterior end, while the tll expression pattern is extended
and hkb is retracted at the anterior end. Embryos derived from germlines homozygous for
loss-of-function mutations in general RTK components like drk, sos, Ras1 or D-Raf show similar terminal
defects, albeit to various degrees, consistent with their role in Torso RTK-mediated signaling (Hou
et al., 1995).

To determine whether ksr acts in the Torso pathway, we used the FLP-DFS system (Hou
et al., 1995) to generate ksr germline clones and examined the terminal structures of embryos
derived from homozygous mutant oocytes. Like embryos derived from torso mutant mothers,
cuticle preparations of ksr^S-638 embryos revealed severe terminal defects. They are missing posterior
structures beyond the seventh abdominal segment and have collapsed head skeletons. In addition,
no tll or hkb expression is detected at the posterior end, while a broader domain of tll expression and
a reduced one for hkb are observed at the anterior extremity. These results indicate that ksr also
functions in the Torso pathway, consistent with Ksr being a general component acting downstream
of RTKs.

Activated D-Raf rescues terminal defects observed in embryos derived from germlines
homozygous for ksr^S-638.

The inability of ksr mutants to suppress the sE-Raf^' phenotype in the eye suggested that 
Ksr functions upstream or in parallel to D-Raf, but not downstream. To clarify where ksr functions 
relative to D-Raf in the Torso pathway, RNA encoding an activated form of D-Raf (Raf^Tor4021) was
injected into embryos derived from germlines homozygous for ksr^S-638. If Ksr functions solely
upstream of D-Raf then activated D-Raf should rescue the mutant phenotype. In contrast, if Ksr
functions solely downstream of D-Raf then injection of activated D-Raf RNA should have no
influence on the ksr-associated embryonic phenotype. It is also possible that rescue might be
observed if Ksr functions in a pathway parallel to D-Raf and can be bypassed by activation of D-Raf
to sufficiently high levels. Injection of activated D-Raf partially rescued the ksr^S-638-associated
embryonic terminal defects. These results confirm that Ksr does not act downstream of D-Raf.

Experimental Procedures: 

Fly culture and crosses were performed according to standard procedures. Clonal analysis
in the eye was performed on the ksr^S-638 allele (the strongest suppressor of sev-Ras1^V12 among the
ksr alleles) using the FLP/FRT system (Xu and Rubin, 1993).

ksr^S-638 germline clones were generated as described in Hou et al. (1995). Cuticle
preparation of embryos was performed as described in Belvin et al. (1995). In situ hybridization
was performed according to Dougan and DiNardo (1992) using digoxigenin-labelled RNA probes.
Injection of embryos was performed as described in Anderson and Nusslein-Volhard (1984). An
in vitro transcription kit (Promega) was used to synthesize activated D-Raf RNA from the Raf^Tor4021
DNA template (Dickson et al., 1992).

Scanning electron microscopy was performed as described by Kimmel et al. (1990). 
Fixation and sectioning of adult eyes were performed as described by Tomlinson and Ready (1987).

The βGGT-I locus was recovered from a chromosome walk initiated by screening a cosmid
library (Tamkun et al., 1992) with a genomic fragment flanking a P-element [l(2)05714] inserted
at 25B4-6 (Karpen and Spradling, 1992; Berkeley Drosophila Genome Project, pers. comm.). A
1.7 kb SpeI-SphI genomic fragment spanning the S-2126 allele inversion breakpoint was used to
screen a Drosophila eye-antennal imaginal disc cDNA library in λgt10. Sixteen related cDNA
clones were isolated from ~700,000 pfu screened.

The ksr gene was isolated from a chromosome walk. Genomic blot analysis of X-ray-induced
ksr alleles was performed according to standard procedures (Sambrook et al., 1989). The
2.9 kb and 2.2 kb BamHI fragments from cosmid m identified polymorphisms in the S-69 and
S-511 alleles, respectively. A 7 kb EcoRI genomic fragment encompassing all of the 2.9 kb BamHI
fragment and part of the 2.2 kb BamHI fragment was used along with the 2.2 kb BamHI fragment
to screen ~700,000 phage from a Drosophila eye-antennal imaginal disc cDNA library in λgt10.
Seven related cDNA clones were isolated and characterized by sequencing.

A D. virilis genomic library was screened at reduced stringency using the Dm Ksr kinase
domain as a probe. In brief, filters were hybridized in 5X SSCP; 10X Denhardt's; 0.1% SDS; 200
µg/ml sonicated salmon sperm DNA at 42°C for 12 hrs, rinsed several times at room temperature
and washed twice for 2 hrs at 50°C in 1X SSC; 0.1% SDS. Twelve genomic clones were identified; one
was purified and analyzed by sequencing.

A DNA fragment corresponding to the hb DNA sequence was prepared by PCR from a
mouse brain cDNA library and used as a probe to screen a mouse PCC4 teratocarcinoma cDNA
library (Stratagene). One full-length cDNA clone, named mKsr-1, was obtained from 1 X 10^6 pfu
screened. Using the mKsr-1 kinase domain as a probe, 1 X 10* pfu of a human fetal brain cDNA
library (Clontech) was hybridized at reduced stringency (see above). Thirteen related cDNA clones
were isolated and characterized by sequencing. They all represent partial transcripts and only one
of them, named hKsr-1, has a complete kinase domain.

DNA sequencing was performed by the dideoxy chain termination procedure (Sanger et al.,
1977) using the Automated Laser Fluorescence (ALF) system (Pharmacia). Templates were
prepared by sonicating plasmid DNA and inserting the sonicated DNA into the M13mp10 vector.
The entire coding regions of βGGT-I and Ksr cDNAs from each species were sequenced on both
strands, as well as the genomic regions that correspond to the βGGT-I and Dm ksr loci. Sequences
were analysed using the Staden (R. Staden, MRC Laboratory of Molecular Biology, Cambridge UK) and the
Genetics Computer Group, Inc. software packages. The chromosomal regions for different βGGT-I
and ksr mutant alleles were cloned into the λZAP-express vector (Stratagene) and their respective
coding regions were completely sequenced using oligonucleotide primers.

Pharmaceutical lead compound screening assays.

1. Protocol for Ksr - substrate phosphorylation assay.

A. Reagents:

- Neutralite Avidin: 20 µg/ml in PBS.

- hKsr: 10^-8 - 10^-5 M hKsr at 20 ng/ml in PBS.

- Blocking buffer: 5% BSA, 0.5% Tween 20 in PBS; 1 hour at room temperature.

- Assay Buffer: 100 mM KCl, 20 mM HEPES pH 7.6, 0.25 mM EDTA, 1% glycerol, 0.5%
NP-40, 50 mM BME, 1 mg/ml BSA, cocktail of protease inhibitors.

- [32P]γ-ATP 10x stock: 2 x 10^-5 M cold ATP with 100 µCi [32P]γ-ATP. Place in the 4°C
microfridge during screening.

- Substrate: 2 x 10*M biotinylated synthetic peptide kinase substrate (MBP, Sigma) at 20
µg/ml in PBS.

- Protease inhibitor cocktail (1000X): 10 mg Trypsin Inhibitor (BMB # 109894), 10 mg
Aprotinin (BMB # 236624), 25 mg Benzamidine (Sigma # B-6506), 25 mg Leupeptin (BMB #
1017128), 10 mg APMSF (BMB # 917575), and 2 mM NaVO3 (Sigma # S-6508) in 10 ml of PBS.

B. Preparation of assay plates:

- Coat with 120 µl of stock Neutralite avidin per well overnight at 4°C.

- Wash 2 times with 200 µl PBS.

- Block with 150 µl of blocking buffer.

- Wash 2 times with 200 µl PBS.

C. Assay:

- Add 40 µl assay buffer/well.

- Add 40 µl hKsr (0.1-10 pmoles/40 µl in assay buffer).

- Add 10 µl compound or extract.

- Shake at 30°C for 15 minutes.

- Add 10 µl [32P]γ-ATP 10x stock.

- Add 10 µl substrate.

- Shake at 30°C for 15 minutes.

- Incubate additional 45 minutes at 30°C.

- Stop the reaction by washing 4 times with 200 µl PBS.

- Add 150 µl scintillation cocktail.

- Count in Topcount.

D. Controls for all assays (located on each plate):

a. Non-specific binding (no hKsr added)

b. cold ATP to achieve 80% inhibition.
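
The quantities in the protocol above can be related by simple arithmetic. The following Python sketch is illustrative only: the function names and the example counts are hypothetical, and the percent-inhibition formula (background-subtracted signal relative to an uninhibited well) is a conventional calculation assumed here rather than one specified in the protocol.

```python
def molar_concentration(pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol dissolved in volume_ul microliters to mol/L."""
    return (pmol * 1e-12) / (volume_ul * 1e-6)


def percent_inhibition(sample_cpm: float, background_cpm: float,
                       uninhibited_cpm: float) -> float:
    """Percent inhibition relative to an uninhibited well (assumed formula).

    background_cpm is the non-specific binding control (no hKsr added);
    uninhibited_cpm is a well run without compound (a hypothetical control
    that is not itemized in the list above).
    """
    signal = uninhibited_cpm - background_cpm
    if signal <= 0:
        raise ValueError("uninhibited signal must exceed background")
    return 100.0 * (1.0 - (sample_cpm - background_cpm) / signal)


# hKsr is added at 0.1-10 pmoles per 40 microliters of assay buffer.
low = molar_concentration(0.1, 40.0)
high = molar_concentration(10.0, 40.0)
print(f"{low:.2e} M to {high:.2e} M")

# Hypothetical counts chosen so that the sample shows 80% inhibition.
print(percent_inhibition(sample_cpm=2800, background_cpm=800, uninhibited_cpm=10800))
```

With these definitions, a cold ATP control that meets the 80% inhibition target would give a background-subtracted signal equal to one fifth of the uninhibited signal.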
2. Protocol for hKsr - Raf binding assay.

A. Reagents:

- Anti-myc antibody: 20 ng/ml in PBS.

- Blocking buffer: 5% BSA, 0.5% Tween 20 in PBS; 1 hour at room temperature.

- Assay Buffer: 100 mM KCl, 20 mM HEPES pH 7.6, 0.25 mM EDTA, 1% glycerol, 0.5%
NP-40, 50 mM β-mercaptoethanol, 1 mg/ml BSA, cocktail of protease inhibitors.

- 33P hKsr 10x stock: 10^-8 - 10^-6 M "cold" hKsr (full length) supplemented with 200,000-
250,000 cpm of labeled hKsr (HMK-tagged) (Beckman counter). Place in the 4°C microfridge
during screening.

- Protease inhibitor cocktail (1000X): 10 mg Trypsin Inhibitor (BMB # 109894), 10 mg
Aprotinin (BMB # 236624), 25 mg Benzamidine (Sigma # B-6506), 25 mg Leupeptin (BMB #
1017128), 10 mg APMSF (BMB # 917575), and 2 mM NaVO3 (Sigma # S-6508) in 10 ml of PBS.

- Raf: 10^-8 - 10^-5 M myc epitope-tagged Raf in PBS.

B. Preparation of assay plates:

- Coat with 120 µl of stock anti-myc antibody per well overnight at 4°C.

- Wash 2X with 200 µl PBS.

- Block with 150 µl of blocking buffer.

- Wash 2X with 200 µl PBS.

C. Assay:

- Add 40 µl assay buffer/well.

- Add 10 µl compound or extract.

- Add 10 µl 33P-hKsr (20,000-25,000 cpm/0.1-10 pmoles/well = 10^-9 - 10^-7 M final
concentration).

- Shake at 25°C for 15 minutes.

- Incubate additional 45 minutes at 25°C.

- Add 40 µl epitope-tagged Raf (0.1-10 pmoles/40 µl in assay buffer).

- Incubate 1 hour at room temperature.

- Stop the reaction by washing 4 times with 200 µl PBS.

- Add 150 µl scintillation cocktail.

- Count in Topcount.

D. Controls for all assays (located on each plate):

a. Non-specific binding (no hKsr added)

b. Soluble (non-tagged Raf) to achieve 80% inhibition.
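
The relationship between the 10x labeled hKsr stock and the final concentration quoted in the assay steps can be checked with a short dilution calculation. The Python sketch below is illustrative; the function and variable names are hypothetical, and the volumes are simply those listed in the assay steps above.

```python
def final_concentration(stock_molar: float, added_ul: float, total_ul: float) -> float:
    """Concentration after diluting added_ul of stock into total_ul final volume."""
    return stock_molar * added_ul / total_ul


# Volumes added per well in the binding assay (microliters).
volumes_ul = {
    "assay buffer": 40,
    "compound or extract": 10,
    "33P-hKsr 10x stock": 10,
    "epitope-tagged Raf": 40,
}
total_ul = sum(volumes_ul.values())

# The 10x stock spans 1e-8 to 1e-6 M; 10 microliters go into the well.
for stock in (1e-8, 1e-6):
    final = final_concentration(stock, volumes_ul["33P-hKsr 10x stock"], total_ul)
    print(f"stock {stock:.0e} M -> final {final:.0e} M")
```

The well volume totals 100 µl, so the 10 µl addition is a tenfold dilution, which agrees with the final hKsr range given in the assay steps.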
All publications and patent applications cited in this specification are herein incorporated
by reference as if each individual publication or patent application were specifically and
individually indicated to be incorporated by reference. Although the foregoing invention has been
described in some detail by way of illustration and example for purposes of clarity of understanding,
it will be readily apparent to those of ordinary skill in the art in light of the teachings of this
invention that certain changes and modifications may be made thereto without departing from the
spirit or scope of the appended claims.

SEQUENCE LISTING

SEQ ID NO: 1 cDNA sequence of Drosophila melanogaster Ksr

SEQ ID NO: 2 amino acid sequence of Drosophila melanogaster Ksr

SEQ ID NO: 3 genomic sequence of Drosophila virilis Ksr

SEQ ID NO: 4 amino acid sequence of Drosophila virilis Ksr

SEQ ID NO: 5 cDNA sequence of Mus musculus Ksr

SEQ ID NO: 6 amino acid sequence of Mus musculus Ksr

SEQ ID NO: 7 cDNA composite sequence of human Ksr

SEQ ID NO: 8 amino acid composite sequence of human Ksr

SEQ ID NO: 9 cDNA sequence of human Ksr'

SEQ ID NO: 10 amino acid sequence of human Ksr'

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Rubin, Gerry M.
Therrien, Marc
Chang, Henry C.
Karim, Felix D.
Wassarman, David A.
(ii) TITLE OF INVENTION: A Novel Protein Kinase Required for Ras
Signal Transduction
(iii) NUMBER OF SEQUENCES: 12
(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP

(B) STREET: 268 BUSH STREET, SUITE 3200

(C) CITY: SAN FRANCISCO

(D) STATE: CALIFORNIA

(E) COUNTRY: USA

(F) ZIP: 94104
(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: OSMAN, RICHARD A
(B) REGISTRATION NUMBER: 36,627

(C) REFERENCE/DOCKET NUMBER: B96-010
(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (415) 343-4341

(B) TELEFAX: (415) 343-4342

(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3697 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GAATTCCAAT TATTGCTTTT TCGCATTGCC TAAGCCGTTT AGAGTTGCGG GCGTTAGCGT 60

GCGCGATAGC CGGAGCACCG AACGTCAAGG TCGCTTGGCG AGGGCCACAA TGCGGGGCGG 120

AGTCCCAGCC ATTGGTCCCA TCGAATCGTC GAGTCCCCGA GAGGGCGTCT GAAAAAATCA 180

ATCGGGCTCC ACTCCGTCGC GAATAAGCAG GATGAGCAGC AACAACAACG CACCCGCATC 240

GGCTCCAGAC ACGGGCTCCA CCAATGCCAA CGATCCCATC TCCGGTTCGC TGTCCGTAGA 300

CAGCAACCTG GTTATCATTC AGGACATGAT TGATCTCTCG GCCAACCATC TGGAGGGCCT 360 

GCGAACGCAG TGCGCGATCA GCTCCACGCT GACGCAGCAG GAGATTCGTT GCCTGGAGTC 420 

GAAGCTGGTG CGATACTTCT CCGAGCTGCT GCTGGCGAAG ATGCGGCTAA ATGAGCGCAT 480 

CCCGGCCAAC GGGCTTGTGC CCCACACAAC GGGCAACGAA CTGAGGCAAT GGCTGCGCGT 540 

AGTGGGCCTT AGCCAGGGGA CTCTTACCGC CTGCCTTGCT CGCCTGACCA CTCTAGAGCA 600 

AAGCCTGCGT CTCAGCGACG AGGAGATCCG TCAACTCCTG GCTGACAGCC CCAGCCAGCG 660 

AGAGGAGGAG GAACTGCGAC GCCTGACCAG GGCCATGCAG AACTTAAGGA AGTGCATGGA 720 

GTCGCTGGAG AGCGGTACTG CGGCTAGCAA CAACGATCCA GAGCAGTGGC ACTGGGACTC 780 

CTGGGACAGG CCCACCCACA TTCATCGCGG CAGTGTGGGA AACATTGGAC TGGGTAACAA 840 

TTCAACCGCC TCCCCGAGAA CCCATCATCG CCAGCATGGT GTCAAGGGAA AGAATTCCGC 900 

TCTGGCCAAC TCCACCAACT TCAAAAGTGG CCGCCAATCG CCCTCAGCGA CAGAAGAGCT 960 

GAACAGCACA CAGGGTTCCC AGCTGACTTT AACCCTTACG CCCTCGCCAC CCAATTCGCC 1020 

CTTCACGCCT TCCAGTGGGC TGAGCAGCAG CCTTAATGGA ACACCACAGA GGAGTCGTGG 1080 

TACCCCGCCG CCAGCCAGAA AGCACCAGAC CTTGCTGAGC CAGAGTCATG TGCAAGTGGA 1140 

CGGGGAGCAA TTAGCCCGCA ACCGTTTGCC CACTGATCCC AGCCCCGATA GCCACAGCTC 1200 

CACCAGCTCG GACATCTTTG TGGACCCAAA TACTAATGCC AGCTCCGGAG GAAGTTCCTC 1260 

GAACGTGCTT ATGGTGCCAT GCTCTCCGGG CGTGGGTCAC GTGGGCATGG GTCATGCAAT 1320 

CAAGCATCGT TTCACCAAGG CCCTGGGCTT CATGGCCACC TGTACCCTGT GCCAGAAGCA 1380 

GGTCTTTCAC CGCTGGATGA AGTGCACCGA CTGCAAGTAC ATCTGCCACA AGTCATGCGC 1440 

ACCGCACGTA CCGCCCTCCT GTGGACTTCC ACGAGAATAT GTGGACGAGT TTCGGCACAT 1500 

AAAGGAGCAG GGAGGATACG CCAGTCTGCC GCATGTGCAT GGCGCGGCGA AAGGATCCCC 1560 

TTTGGTAAAA AAGAGCACCC TGGGTAAGCC CTTGCATCAG CAGCACGGCG ATAGCAGTTC 1620 

GCCGAGTTCC AGCTGCACTA GTTCCACGCC CAGCAGTCCG GCGCTGTTCC AGCAAAGGGA 1680 

GCGCGAGCTG GATCAGGCGG GCAGCAGCTC TAGCGCCAAT CTGTTACCTA CGCCTTCGCT 1740 

TGGCAAGCAC CAGCCGAGTC AATTCAACTT TCCCAACGTG ACGGTGACGA GCAGTGGCGG 1800 

AAGCGGTGGT GTATCGCTCA TCTCCAATGA ACCAGTGCCA GAGCAATTCC CCACGGCGCC 1860 

TGCAACAGCC AACGGAGGAC TTGATAGTCT GGTGAGCAGC TCCAACGGGC ACATGAGCTC 1920 

GCTCATCGGT AGCCAAACTT CAAACGCTTC TACTGCGGCC ACCTTGACGG GCAGTCTGGT 1980 

CAATAGCACA ACCACCACCA GCACCTGCAG TTTCTTTCCG CGAAAATTGA GCACAGCCGG 2040 

TGTGGATAAG AGGACGCCGT TCACCAGCGA GTGCACGGAT ACCCACAAGT CAAATGACAG 2100 

CGACAAGACA GTCTCCTTGT CTGGAAGTGC CAGCACGGAC TCGGACCGGA CACCCGTTCG 2160 

TGTGGATTCA ACGGAAGACG GAGACTCGGG ACAATGGCGA CAGAACTCGA TCTCACTCAA 2220 

GGAATGGGAC ATCCCGTATG GTGATCTGCT TCTGCTCGAG CGGATAGGGC AGGGACGCTT 2280 

CGGCACCGTG CATCGAGCCC TTTGGCACGG AGATGTGGCG GTTAAGCTGC TCAACGAGGA 2340 

CTATCTGCAA GACGAACACA TGCTGGAGAC GTTTCGCAGC GAGGTAGCCA ACTTCAAGAA 2400 

CACTCGACAC GAGAACCTGG TGCTGTTCAT GGGAGCCTGC ATGAACCCAC CATATTTGGC 2460 

CATTGTGACT TCATTGTGCA AGGGCAACAC CTTGTATACG TATATTCACC AGCGTCGGGA 2520 

GAAGTTTGCC ATGAACCGGA CTCTCCTCAT TGCCCAGCAG ATCGCCCAGG GCATGGGCTA 2580 

CCTGCACGCA AGGGAGATCA TCCACAAAGA TCTGCGCACC AAGAACATCT TCATCGAGAA 2640 

CGGCAAGGTG ATTATCACGG ACTTTGGGCT GTTCAGCTCC ACCAAGCTGC TCTACTGTGA 2700 

TATGGGCCTA GGAGTGCCCC ACAACTGGTT GTGCTACCTG GCGCCGGAGC TAATCCGAGC 2760 

ATTGCAGCCG GAGAAGCCGC GTGGAGAGTG TCTGGAGTTC ACCCCATACT CCGATGTCTA 2820 

CTCTTTCGGA ACCGTTTGGT ACGAGCTAAT CTGCGGCGAG TTCACATTCA AGGATCAGCC 2880 

GGCGGAATCG ATCATCTGGC AGGTTGGCCG TGGGATGAAG CAGTCGCTGG CCAACCTGCA 2940 

GTCTGGACGG GATGTCAAGG ACTTGCTGAT GCTGTGCTGG ACCTACGAGA AGGAGCACCG 3000

GCCGCAGTTC GCACGCCTGC TCTCCCTGCT GGAGCATCTT CCCAAGAAGC GTCTGGCGCG 3060

CAGTCCCTCC CACCCCGTCA ACCTTTCCCG TTCCGCCGAG TCCGTGTTCT GAGGGAACTG 3120

CAGCATGGCC ACTGTCACTG TCTAGTACAA TTTCGATCTA CCAACTAAGC TAGCTCGCTT 3180

TGTGCCCTCG TCCACTCTAC ACAAACTCTC TCCCAAGGCG AAGTTCTATC GAGCCGAGCG 3240

AAGATTGTAA ATACATAAAC GTAACTACCA AATTATAGCA ATCCATTTTA AAAACTACAT 3300

APATATGTGT AGGCATGTAT CGGGAGCACT CCAGTTGCAG TTGTTAGCAA ACGAAACAAA 3360

GGCAAATCAA ATGTTAACTC UAAAAAviACA AAACGCTTAA ATGTTTAAGA GCAGAGGCAA 3420

ACAGAGAAGG CATAGACATA CATATACAAA CAAACAAACA AGCACTGTGG CAAACATAAA 3480

TGTAAACGTT AATCAGGTGA GCAATTTCTA AATTGTTAAT TATGTGTAAG AGAACTATAT 3540

ATATATATAT ATATATATAT ATATATATAT ATATACATGT ATATACAGCA GCAATGTATT 3600

GTATATGACG GACTAGTGTT AAATTAAATA TATATTGTGA ATTATGTATG GTCAAGTGTA 3660

TATAGTAAAT GGACTTTAAA TGCGAAATCG GGAATTC 3697




(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 966 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Ser Asn Asn Asn Ala Pro Ala Ser Ala Pro Asp Thr Gly Ser
1 5 10 15

Thr Asn Ala Asn Asp Pro Ile Ser Gly Ser Leu Ser Val Asp Ser Asn
20 25 30

Leu Val Ile Ile Gln Asp Met Ile Asp Leu Ser Ala Asn His Leu Glu
35 40 45

Gly Leu Arg Thr Gln Cys Ala Ile Ser Ser Thr Leu Thr Gln Gln Glu
50 55 60

Ile Arg Cys Leu Glu Ser Lys Leu Val Arg Tyr Phe Ser Glu Leu Leu
65 70 75 80

Leu Ala Lys Met Arg Leu Asn Glu Arg Ile Pro Ala Asn Gly Leu Val
85 90 95

Pro His Thr Thr Gly Asn Glu Leu Arg Gln Trp Leu Arg Val Val Gly
100 105 110

Leu Ser Gln Gly Thr Leu Thr Ala Cys Leu Ala Arg Leu Thr Thr Leu
115 120 125

Glu Gln Ser Leu Arg Leu Ser Asp Glu Glu Ile Arg Gln Leu Leu Ala
130 135 140

Asp Ser Pro Ser Gln Arg Glu Glu Glu Glu Leu Arg Arg Leu Thr Arg
145 150 155 160

Ala Met Gln Asn Leu Arg Lys Cys Met Glu Ser Leu Glu Ser Gly Thr
165 170 175

Ala Ala Ser Asn Asn Asp Pro Glu Gln Trp His Trp Asp Ser Trp Asp
180 185 190

Arg Pro Thr His Ile His Arg Gly Ser Val Gly Asn Ile Gly Leu Gly
195 200 205
Asn Asn Ser Thr Ala Ser Pro Arg Thr His His Arg Gln His Gly Val
210 215 220

Lys Gly Lys Asn Ser Ala Leu Ala Asn Ser Thr Asn Phe Lys Ser Gly
225 230 235 240

Arg Gln Ser Pro Ser Ala Thr Glu Glu Leu Asn Ser Thr Gln Gly Ser
245 250 255

Gln Leu Thr Leu Thr Leu Thr Pro Ser Pro Pro Asn Ser Pro Phe Thr
260 265 270

Pro Ser Ser Gly Leu Ser Ser Ser Leu Asn Gly Thr Pro Gln Arg Ser
275 280 285

Arg Gly Thr Pro Pro Pro Ala Arg Lys His Gln Thr Leu Leu Ser Gln
290 295 300

Ser His Val Gln Val Asp Gly Glu Gln Leu Ala Arg Asn Arg Leu Pro
305 310 315 320

Thr Asp Pro Ser Thr Asp Ser His Ser Ser Thr Ser Ser Asp Ile Phe
325 330 335

Val Asp Pro Asn Thr Asn Ala Ser Ser Gly Gly Ser Ser Ser Asn Val
340 345 350

Leu Met Val Pro Cys Ser Pro Gly Val Gly His Val Gly Met Gly His
355 360 365

Ala Ile Lys His Arg Phe Thr Lys Ala Leu Gly Phe Met Ala Thr Cys
370 375 380

Thr Leu Cys Gln Lys Gln Val Phe His Arg Trp Met Lys Cys Thr Asp
385 390 395 400

Cys Lys Tyr Ile Cys His Lys Ser Cys Ala Pro His Val Pro Pro Ser
405 410 415

Cys Gly Leu Pro Arg Glu Tyr Val Asp Glu Phe Arg His Ile Lys Glu
420 425 430

Gln Gly Gly Tyr Ala Ser Leu Pro His Val His Gly Ala Ala Lys Gly
435 440 445

Ser Pro Leu Val Lys Lys Ser Thr Leu Gly Lys Pro Leu His Gln Gln
450 455 460

His Gly Asp Ser Ser Ser Pro Ser Ser Ser Cys Thr Ser Ser Thr Pro
465 470 475 480

Ser Ser Pro Ala Leu Phe Gln Gln Arg Glu Arg Glu Leu Asp Gln Ala
485 490 495

Gly Ser Ser Ser Ser Ala Asn Leu Leu Pro Thr Pro Ser Leu Gly Lys
500 505 510

His Gln Pro Ser Gln Phe Asn Phe Pro Asn Val Thr Val Thr Ser Ser
515 520 525

Gly Gly Ser Gly Gly Val Ser Leu Ile Ser Asn Glu Pro Val Pro Glu
530 535 540

Gln Phe Pro Thr Ala Pro Ala Thr Ala Asn Gly Gly Leu Asp Ser Leu
545 550 555 560

Val Ser Ser Ser Asn Gly His Met Ser Ser Leu Ile Gly Ser Gln Thr
565 570 575

Ser Asn Ala Ser Thr Ala Ala Thr Leu Thr Gly Ser Leu Val Asn Ser
580 585 590

Thr Thr Thr Thr Ser Thr Cys Ser Phe Phe Pro Arg Lys Leu Ser Thr
595 600 605

Ala Gly Val Asp Lys Arg Thr Pro Phe Thr Ser Glu Cys Thr Asp Thr
610 615 620

His Lys Ser Asn Asp Ser Asp Lys Thr Val Ser Leu Ser Gly Ser Ala
625 630 635 640

Ser Thr Asp Ser Asp Arg Thr Pro Val Arg Val Asp Ser Thr Glu Asp
645 650 655

Gly Asp Ser Gly Gln Trp Arg Gln Asn Ser Ile Ser Leu Lys Glu Trp
660 665 670

Asp Ile Pro Tyr Gly Asp Leu Leu Leu Leu Glu Arg Ile Gly Gln Gly
675 680 685

Arg Phe Gly Thr Val His Arg Ala Leu Trp His Gly Asp Val Ala Val
690 695 700

Lys Leu Leu Asn Glu Asp Tyr Leu Gln Asp Glu His Met Leu Glu Thr
705 710 715 720

Phe Arg Ser Glu Val Ala Asn Phe Lys Asn Thr Arg His Glu Asn Leu
725 730 735

Val Leu Phe Met Gly Ala Cys Met Asn Pro Pro Tyr Leu Ala Ile Val
740 745 750

Thr Ser Leu Cys Lys Gly Asn Thr Leu Tyr Thr Tyr Ile His Gln Arg
755 760 765

Arg Glu Lys Phe Ala Met Asn Arg Thr Leu Leu Ile Ala Gln Gln Ile
770 775 780

Ala Gln Gly Met Gly Tyr Leu His Ala Arg Glu Ile Ile His Lys Asp
785 790 795 800

Leu Arg Thr Lys Asn Ile Phe Ile Glu Asn Gly Lys Val Ile Ile Thr
805 810 815

Asp Phe Gly Leu Phe Ser Ser Thr Lys Leu Leu Tyr Cys Asp Met Gly
820 825 830

Leu Gly Val Pro His Asn Trp Leu Cys Tyr Leu Ala Pro Glu Leu Ile
835 840 845

Arg Ala Leu Gln Pro Glu Lys Pro Arg Gly Glu Cys Leu Glu Phe Thr
850 855 860

Pro Tyr Ser Asp Val Tyr Ser Phe Gly Thr Val Trp Tyr Glu Leu Ile
865 870 875 880

Cys Gly Glu Phe Thr Phe Lys Asp Gln Pro Ala Glu Ser Ile Ile Trp
885 890 895

Gln Val Gly Arg Gly Met Lys Gln Ser Leu Ala Asn Leu Gln Ser Gly
900 905 910

Arg Asp Val Lys Asp Leu Leu Met Leu Cys Trp Thr Tyr Glu Lys Glu
915 920 925

His Arg Pro Gln Phe Ala Arg Leu Leu Ser Leu Leu Glu His Leu Pro
930 935 940

Lys Lys Arg Leu Ala Arg Ser Pro Ser His Pro Val Asn Leu Ser Arg
945 950 955 960

Ser Ala Glu Ser Val Phe
965

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3681 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCCCCAAAAA CTATAAAATT TTTCGCGTTT TTCTCATAGC AGAAGCTGTC TCGAAGTCCG 60
CATTTCGCAG GACTGTTCAT GTGTGCTTCC AGCAAGCGAA AAAAGCTGGT TGATGTGGAC 120
AGAATGTGTG TCAAAGTGGT GCAAACAACA AATGATTTGT AAGTGCGTCT GAAAAAATCA 180
ATCAGTTTGT ACTGCTGGAA GGGGCGGGCG GGCCACAACA AAATGAGCAG CAGCGCCGCC 240
GCCCAGCTGA CTGCGCCGCC AGTCAGCAAC AGCAACAGCA GCAGCAGTAA CAACAATACA 300
ACAACGACTG CGAGCGAAAG CAATCTAATC ATCATACAGG ATATGATTGA TCTCTCGGCC 360
AACCATCTGG AGGGTCTGCG AACACAGTGC GCAACGAGCG CGACGTTGAC GCAACAGGAG 420
ATCCGCTGCC TAGAGTCCAA GTTGGTGCGC TACTTCTCCG AACTGCTCTT GACCAAAACG 480
AGACTCAACG AACGCATACC CGCGAACGGT CTGCTGCCCC ATCATCAGGC TACCGGGAAC 540
GAGTTGCGCC AATGGCTGCG AGTAGTTGGA CTCAGTCCGG AGTCACTGAA TGCATGCCTA 600 
GCGCGTCTAA CGACATTGGA GCAAACACTG CAGCTGAGCG ATGAAGAACT GAAACAACTG 660 
CTTGCCCACA ATTCAAGTAC CCAGCTCGAC GAGGAACTGC GGCGGCTGAC CAAAGCGATC 720 
CATAATCTCC GAAAATGCAT GGAAACGCTG GACAGCAGCG GCGCAGTTGC GTCCAACGTC 780 
GATCCGGAAC AATGGCACTG GGACTCCTGG GATCGACCCC ATCCGCATCA CATGCACCGC 840 
GGCAGCATTG GCAATATTGG CCTAGGACTA AGCAGCGCCT CACCTCGCGC CCATCATCGT 900 
CAACATCAAC ATCAACACGC GAACAGCAAG CCGAAAATTG TTAACAATTC TGCCTCAAGC 960 
TCCCGCAGCG AACAGCAACC ACTGACTGGT TCTCAGTTGA CCTTAACACT GACGCCCTCG 1020 
CCACCCAACT CGCCCTTTAC GCCCGCCTCA GGGACGGCAT CCGCCAGCGG CACTCCGCAG 1080 
CGCAGCCGCA GTACCACAAC AGCGGCGGGA ACGCCACCAC CAGCCAAGAA GCATCAAACG 1140 
CTGCTCATGC ACAACAGCAG CGCTTCGGAA ACGGCACTCG CGGAGCAGCC TCCACGGCCA 1200 
CCGCGCAGCC GTCTACCCAC AGATCCTAGC CCGGATAGCC ACAGCTCGGC CAGCAGTTCG 1260 
GACATTTTTG TGGACGGTGG CAGTATCAAC AGCTCCAATG TACTACTAGT GCCGCCCTCG 1320 
CCAGGTGTGG CACACGTGGG CATGGGTCAT ACCATTAAGC ACCGTTTCAG TAAATGGTTT 1380 
GGCTTCATGG CCACGTGCAA ACTGTGCCAA AAGCAGATGA TGAGCCACTG GTTCAAGTGC 1440 
ACCGACTGCA AATATATTTG CCACAAGTCC TGTGCGCCGC ATGTGCCGCC CTCGTGTGGC 1500 
CTTCCACCCG AATATGTTCA CGAGTTTCGT CAAACTCAGG TGGGCGGCAG ATGGGACCCT 1560 
GCGCAGCACA GCAGCAGCAA GGCATCACCA GTGCCCAGGA AGAGCACGCT GGGCAAACCG 1620 
CAATTGCAGC AGCCACAGCT GCAGCACGGG GACAGCAGCT CACCAAGCTC GAGCTGCACC 1680 
AGCTCAACGC CCAGCAGTCC AGCATTGTTC CAGCAGCAGC AACTGCAACT GGCCACGCCC 1740 
AGCGCCTGCC AGCCGAAACC AGCACCAGCA GCGGTAGCAG CAGCAGCAAC ACAACAGGGT 1800 
CAACAGAGTC AATTCAATTT CCCCAACGTG ACCATCACAA GCATCAATGC CTGCAATAGT 1860 
AACGCCAGCG CTGCCCAAAC GCTCATATCC AATGAGCCGC AAGCGCATAT GGCCACAACG 1920

GAGTCCACGC TGACCAATGG CAACAACAAC AGCAGCTCCA ACAACGGGAG CAGCGCCAAC 1980

AACAATAGCA GCAGCAGCAG CAGCTGCTCC AATGGTCACC TGCACTCGCT GACTGGAAGT 2040 

CAAGTGTCCA CGCATTCGGC TACCTCGCAA GTGTCGAATG TCAGTGGCAG CAGCTCGGCC 2100 

ACCTACACCT CCAGTCTGGT GAACAGCGGC AGTTTCTTTC CGCGGAAATT GAGCAATGCT 2160 

GGCGTGGACA AGCGGGTGCC CTTTACCAGC GAATATACGG ACACGCACAA GTCGAATGAT 2220 

AGCGACAAGA CGGTTTCGTT GTCGGGCAGC GCCAGCACTG ACTCGGATCG CACGCCTGTG 2280

CGTTTGGACT CCACAGAGGA TGGCGACTCG GGCCAATGGC GGCAGAACTC CATATCATTG 2340 

AAGGAATGGG ATATACCCTA TGGCGATTTG CACTTGCTGG AGCGCATTGG ACAGGGTCGA 2400 

TTTGGCACCG TGCATCGGGC ACTGTGGCAT GGCGATGTCG CTGTGAAGCT GCTCAATGAA 2460 

GACTATCTGC AGGACGAGCA CATGCTGGAA TCGTTTCGCA ACGAGGTGGC CAATTTCAAG 2520 

AAGACGCGAC ACGAGAATCT GGTGCTGTTC ATGGGCGCCT GCATGAATCC GCCGTATTTG 2580

GCCATTGTCA CGGCACTATG CAAGGGCAAC ACCCTGTACA CCTATATACA TCAGCGAAGG 2640 

GAGAAGTTTG CAATGAATCG CACGTTGTTG ATTGCCCAAC AGATTGCCCA GGGCATGGGC 2700 

TATTTGCATG CCAGGGACAT AATACACAAG GATCTGCGCA CCAAGAACAT TTTTATAGAG 2760 

AATGGCAAGG TGATCATTAC GGACTTTGGC CTATTCAGCT CCACAAAGCT GCTGTACTGT 2820 

GATATGGGCT TGGGTGTTCC ACAAAACTGG CTCTGCTACC TGGCCCCGGA ACTAATACGC 2880

GCCCTGCAGC CGTGCAAGCC ACCCGGCGAG TGTCTAGAGT TCACGTCCTA CTCGGATGTT 2940 

TACTCATTTG GCACCGTTTG GTACGAGCTA ATTTGCGGCG AATTCACGTT CAAGGATCAA 3000 

CCGGCGGAGT CAATCATTTG GCAAGTGGGG CGCGGCATGA AACAGTCGCT GGCCAATCTG 3060 

CAGTCTGGTC GTGATGTCAA GGACCTGCTG ATGCTGTGCT GGACCTATGA AAAGGAGCAC 3120 

20 AGGCCGGACT TTGCACGTCT GCT C TCCT TG CTGGAGCATT TGCCAAAGAA GCGCCTGGCA 3180 

CGCAGTCCCT CGCATCCTGT CAACCTCTCG CGCTCAGCGG AATCTGTATT CTAACCAGCC 3240 

GATATACAAA TATATACGTT TATAGACAAA TATGTCATAT ATGTAAGCAG GCGCGCACAC 3300 

ACTCACACAC ACACACACTC TATTTAGCAC AATTTCACGT TATATGTAAA TGTAAGCTAC 3360 

ACACATATGC AAACATACGT ATGTCACTTT AACTGTAATT GTTGTGCGTG CAAAATGTCA 3420 

25 AATGTGAAAT TAGCTCTCCG GTAAGGGAAG CAAGAGAATG CGGAGAGCAA AGCTCACTTC 3480 

CTCAGCCTCA TGTATGTGTA TGTATGTGTA CGACCCTACG ACTCTCAAAG AAAAGTTCAA 3540 

AGTGCATGTG TTACAAAACA AAAAACTGTA AATATACATT TAAAGCAAAT GAAACGAAAC 3600 

TATACATATA TGTGTATATC CAATTATAGC AATTTACAAA TGCATTGTCA AAATAGTTTT 3 660 

TATCTTTAAT TATGTATTGA A 3681 



30 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1003 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ser Ser Ser Ala Ala Ala Gln Leu Thr Ala Pro Pro Val Ser Asn
1 5 10 15

Ser Asn Ser Ser Ser Ser Asn Asn Asn Thr Thr Thr Thr Ala Ser Glu
20 25 30

Ser Asn Leu Ile Ile Ile Gln Asp Met Ile Asp Leu Ser Ala Asn His
35 40 45

Leu Glu Gly Leu Arg Thr Gln Cys Ala Thr Ser Ala Thr Leu Thr Gln
50 55 60

Gln Glu Ile Arg Cys Leu Glu Ser Lys Leu Val Arg Tyr Phe Ser Glu
65 70 75 80

Leu Leu Leu Thr Lys Thr Arg Leu Asn Glu Arg Ile Pro Ala Asn Gly
85 90 95

Leu Leu Pro His His Gln Ala Thr Gly Asn Glu Leu Arg Gln Trp Leu
100 105 110

Arg Val Val Gly Leu Ser Pro Glu Ser Leu Asn Ala Cys Leu Ala Arg
115 120 125

Leu Thr Thr Leu Glu Gln Thr Leu Gln Leu Ser Asp Glu Glu Leu Lys
130 135 140

Gln Leu Leu Ala His Asn Ser Ser Thr Gln Leu Asp Glu Glu Leu Arg
145 150 155 160

Arg Leu Thr Lys Ala Met His Asn Leu Arg Lys Cys Met Glu Thr Leu
165 170 175

Asp Ser Ser Gly Ala Val Ala Ser Asn Val Asp Pro Glu Gln Trp His
180 185 190

Trp Asp Ser Trp Asp Arg Pro His Pro His His Met His Arg Gly Ser
195 200 205

Ile Gly Asn Ile Gly Leu Gly Leu Ser Ser Ala Ser Pro Arg Ala His
210 215 220

His Arg Gln His Gln His Gln His Ala Asn Ser Lys Pro Lys Ile Val
225 230 235 240

Asn Asn Ser Ala Ser Ser Ser Arg Ser Glu Gln Gln Pro Leu Thr Gly
245 250 255

Ser Gln Leu Thr Leu Thr Leu Thr Pro Ser Pro Pro Asn Ser Pro Phe
260 265 270

Thr Pro Ala Ser Gly Thr Ala Ser Ala Ser Gly Thr Pro Gln Arg Ser
275 280 285

Arg Ser Thr Thr Thr Ala Ala Gly Thr Pro Pro Pro Ala Lys Lys His
290 295 300

Gln Thr Leu Leu Met His Asn Ser Ser Ala Ser Glu Thr Ala Leu Ala
305 310 315 320

Glu Gln Pro Pro Arg Pro Pro Arg Ser Arg Leu Pro Thr Asp Pro Ser
325 330 335

Pro Asp Ser His Ser Ser Ala Ser Ser Ser Asp Ile Phe Val Asp Gly
340 345 350

Gly Ser Ile Asn Ser Ser Asn Val Leu Leu Val Pro Pro Ser Pro Gly
355 360 365

Val Ala His Val Gly Met Gly His Thr Ile Lys His Arg Phe Ser Lys
370 375 380

Trp Phe Gly Phe Met Ala Thr Cys Lys Leu Cys Gln Lys Gln Met Met
385 390 395 400

Ser His Trp Phe Lys Cys Thr Asp Cys Lys Tyr Ile Cys His Lys Ser
405 410 415

Cys Ala Pro His Val Pro Pro Ser Cys Gly Leu Pro Pro Glu Tyr Val
420 425 430

His Glu Phe Arg Gln Thr Gln Val Gly Gly Arg Trp Asp Pro Ala Gln
435 440 445

His Ser Ser Ser Lys Ala Ser Pro Val Pro Arg Lys Ser Thr Leu Gly
450 455 460

Lys Pro Gln Leu Gln Gln Pro Gln Leu Gln His Gly Asp Ser Ser Ser
465 470 475 480

Pro Ser Ser Ser Cys Thr Ser Ser Thr Pro Ser Ser Pro Ala Leu Phe
485 490 495

Gln Gln Gln Gln Leu Gln Leu Ala Thr Pro Ser Ala Cys Gln Pro Lys
500 505 510

Pro Ala Pro Ala Ala Val Ala Ala Ala Ala Thr Gln Gln Gly Gln Gln
515 520 525

Ser Gln Phe Asn Phe Pro Asn Val Thr Ile Thr Ser Ile Asn Ala Cys
530 535 540

Asn Ser Asn Ala Ser Ala Ala Gln Thr Leu Ile Ser Asn Glu Pro Gln
545 550 555 560

Ala His Met Ala Thr Thr Glu Ser Thr Leu Thr Asn Gly Asn Asn Asn
565 570 575

Ser Ser Ser Asn Asn Gly Ser Ser Ala Asn Asn Asn Ser Ser Ser Ser
580 585 590

Ser Ser Cys Ser Asn Gly His Leu His Ser Leu Thr Gly Ser Gln Val
595 600 605

Ser Thr His Ser Ala Thr Ser Gln Val Ser Asn Val Ser Gly Ser Ser
610 615 620

Ser Ala Thr Tyr Thr Ser Ser Leu Val Asn Ser Gly Ser Phe Phe Pro
625 630 635 640

Arg Lys Leu Ser Asn Ala Gly Val Asp Lys Arg Val Pro Phe Thr Ser
645 650 655

Glu Tyr Thr Asp Thr His Lys Ser Asn Asp Ser Asp Lys Thr Val Ser
660 665 670

Leu Ser Gly Ser Ala Ser Thr Asp Ser Asp Arg Thr Pro Val Arg Leu
675 680 685

Asp Ser Thr Glu Asp Gly Asp Ser Gly Gln Trp Arg Gln Asn Ser Ile
690 695 700

Ser Leu Lys Glu Trp Asp Ile Pro Tyr Gly Asp Leu His Leu Leu Glu
705 710 715 720

Arg Ile Gly Gln Gly Arg Phe Gly Thr Val His Arg Ala Leu Trp His
725 730 735

Gly Asp Val Ala Val Lys Leu Leu Asn Glu Asp Tyr Leu Gln Asp Glu
740 745 750

His Met Leu Glu Ser Phe Arg Asn Glu Val Ala Asn Phe Lys Lys Thr
755 760 765

Arg His Glu Asn Leu Val Leu Phe Met Gly Ala Cys Met Asn Pro Pro
770 775 780

Tyr Leu Ala Ile Val Thr Ala Leu Cys Lys Gly Asn Thr Leu Tyr Thr
785 790 795 800

Tyr Ile His Gln Arg Arg Glu Lys Phe Ala Met Asn Arg Thr Leu Leu
805 810 815

Ile Ala Gln Gln Ile Ala Gln Gly Met Gly Tyr Leu His Ala Arg Asp
820 825 830

Ile Ile His Lys Asp Leu Arg Thr Lys Asn Ile Phe Ile Glu Asn Gly
835 840 845

Lys Val Ile Ile Thr Asp Phe Gly Leu Phe Ser Ser Thr Lys Leu Leu
850 855 860

Tyr Cys Asp Met Gly Leu Gly Val Pro Gln Asn Trp Leu Cys Tyr Leu
865 870 875 880

Ala Pro Glu Leu Ile Arg Ala Leu Gln Pro Cys Lys Pro Pro Gly Glu
885 890 895

Cys Leu Glu Phe Thr Ser Tyr Ser Asp Val Tyr Ser Phe Gly Thr Val
900 905 910

Trp Tyr Glu Leu Ile Cys Gly Glu Phe Thr Phe Lys Asp Gln Pro Ala
915 920 925

Glu Ser Ile Ile Trp Gln Val Gly Arg Gly Met Lys Gln Ser Leu Ala
930 935 940

Asn Leu Gln Ser Gly Arg Asp Val Lys Asp Leu Leu Met Leu Cys Trp
945 950 955 960

Thr Tyr Glu Lys Glu His Arg Pro Asp Phe Ala Arg Leu Leu Ser Leu
965 970 975

Leu Glu His Leu Pro Lys Lys Arg Leu Ala Arg Ser Pro Ser His Pro
980 985 990

Val Asn Leu Ser Arg Ser Ala Glu Ser Val Phe
995 1000

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4094 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GAATTCCCTC GGGGCTTTCC TGCCGAGGCG CCCGTGTCCC CGGGCTCCTC GCCTCGGCCC 60 
CCAGCGGCCC CGATGCCGAG GCATGGATAG AGCGGCGTTG CGCGCGGCAG CGATGGGCGA 120 
GAAAAAGGAG GGCGGCGGCG GGGGCGCCGC GGCGGACGGG GGCGCAGGGG CCGCCGTCAG 180 
CCGGGCGCTG CAGCAGTGCG GCCAGCTGCA GAAGCTCATC GATATCTCCA TCGGCAGTCT 240 
GCGCGGGCTG CGCACCAAGT GCTCAGTGTC TAACGACCTC ACACAGCAGG AGATCCGGAC 300 
CCTAGAGGCA AAGCTGGTGA AATACATTTG CAAGCAGCAG CAGAGCAAGC TTAGTGTGAC 360 
CCCAAGCGAC AGGACCGCCG AGCTCAACAG CTACCCACGC TTCAGTGACT GGCTGTACAT 420 
CTTCAACGTG AGGCCTGAGG TGGTGCAGGA GATCCCCCAA GAGCTCACAC TGGATGCTCT 480 
GCTGGAGATG GACGAGGCCA AAGCCAAGGA GATGCTGCGG CGCTGGGGGG CCAGCACGGA 540 
GGAGTGCAGC CGCCTACAGC AAGCCCTTAC CTGCCTTCGG AAGGTGACTG GCCTGGGAGG 600
GGAGCACAAA ATGGACTCAG GTTGGAGTTC AACAGATGCT CGAGACAGTA GCTTGGGGCC 660
TCCCATGGAC ATGCTTTCCT CGCTGGGCAG AGCGGGTGCC AGCACTCAGG GACCCCGTTC 720
CATCTCCGTG TCCGCCCTGC CTGCCTCAGA CTCTCCGGTC CCCGGCCTCA GTGAGGGCCT 780
CTCGGACTCC TGTATCCCCT TGCACACCAG CGGCCGGCTG ACCCCCCGGG CCCTGCACAG 840
CTTCATCACG CCCCCTACCA CACCCCAGCT ACGACGGCAC GCCAAGCTGA AGCCACCAAG 900
GACACCCCCA CCGCCAAGCC GCAAGGTCTT CCAGCTGCTC CCCAGCTTCC CCACACTCAC 960
ACGGAGCAAG TCCCACGAGT CCCAGCTGGG AAACCGAATC GACGACGTCA CCCCGATGAA 1020
GTTTGAACTC CCTCATGGAT CCCCACAGCT GGTACGAAGG GATATCGGGC TCTCGGTGAC 1080
GCACAGGTTC TCCACAAAGT CATGGTTGTC ACAGGTGTGC AACGTGTGCC AGAAGAGCAT 1140
GATTTTTGGC GTGAAGTGCA AACACTGCAG GTTAAAATGC CATAACAAGT GCACAAAGGA 1200
AGCTCCCGCC TGCAGGATCA CCTTCCTCCC ACTGGCCAGG CTTCGGAGGA CAGAGTCTGT 1260
CCCGTCAGAT ATCAACAACC CAGTGGACAG AGCAGCAGAG CCCCATTTTG GAACCCTTCC 1320
CAAGGCCCTG ACAAAGAAGG AGCACCCTCC AGCCATGAAC CTGGACTCCA GCAGCAACCC 1380
ATCCTCCACC ACGTCCTCCA CACCCTCATC GCCGGCACCT TTCCTGACCT CATCTAATCC 1440
CTCCAGTGCC ACCACGCCTC CCAACCCGTC ACCTGGCCAG CGGGACAGCA GGTTCAGCTT 1500
CCCAGACATT TCAGCCTGTT CTCAGGCAGC CCCGCTGTCC AGCACAGCCG ACAGTACACG 1560
GCTCGACGAC CAGCCCAAAA CAGATGTGCT AGGTGTTCAC GAAGCAGAGG CTGAGGAGCC 1620
TGAGGCTGGC AAGTCAGAGG CAGAGGATGA CGAGGAGGAT GAGGTGGACG ACCTCCCCAG 1680
CTCCCGCCGG CCCTGGAGGG GCCCCATCTC TCGAAAGGCC AGCCAGACCA GCGTTTACCT 1740
GCAAGAGTGG GACATCCCCT TTGAACAGGT GGAACTGGGC GAGCCCATTG GACAGGGTCG 1800
CTGGGGCCGG GTGCACCGAG GCCGTTGGCA TGGCGAGGTG GCCATTCGGC TGCTGGAGAT 1860
GGACGGCCAC AATCAGGACC ACCTGAAGCT GTTCAAGAAA GAGGTGATGA ACTACCGGCA 1920
GACGCGGCAT GAGAACGTGG TGCTCTTCAT GGGGGCCTGC ATGAACCCAC CTCACCTGGC 1980
CATTATCACC AGCTTCTGCA AGGGGCGGAC ATTGCATTCA TTCGTGAGGG ACCCCAAGAC 2040
GTCTCTGGAC ATCAATAAGA CTAGGCAGAT CGCCCAGGAG ATCATCAAGG GCATGGGTTA 2100
TCTTCATGCA AAAGGCATCG TGCACAAGGA CCTCAAGTCC AAGAATGTCT TCTATGACAA 2160
CGGCAAAGTG GTCATCACAG ACTTCGGGCT GTTTGGGATC TCGGGTGTCG TCCGAGAGGA 2220
ACGGCGCGAG AACCAACTGA AACTGTCACA TGACTGGCTG TGCTACCTGG CCCCCGAGAT 2280
CGTACGAGAA ATGATCCCGG GGCGGGACGA GGACCAGCTG CCCTTCTCCA AAGCAGCCGA 2340
TGTCTATGCA TTCGGGACTG TGTGGTATGA ACTACAGGCA AGAGACTGGC CCTTTAAGCA 2400
CCAGCCTGCT GAGGCCTTGA TCTGGCAGAT TGGAAGTGGG GAAGGAGTAC GGCGCGTCCT 2460
GGCATCCGTC AGCCTGGGGA AGGAAGTCGG CGAGATCCTC TCTGCCTGCT GGGCTTTCGA 2520
TCTGCAGGAG AGACCCAGCT TCAGCCTGCT GATGGACATG CTGGAGAGGC TGCCCAAGCT 2580
GAACCGGCGG CTCTCCCACC CTGGGCACTT TTGGAAGTCG GCTGACATTA ACAGCAGCAA 2640
AGTCATGCCC CGCTTTGAAA GGTTTGGCCT GGGGACCCTG GAGTCCGGTA ATCCAAAGAT 2700
GTAGCCAGCC CTGCACGTTC ATGCAGAGAG TGTCTTCCTT TCGAAAACAT GATCACGAAA 2760
CATGCAGACC ACCACCTCAA GGAATCAGAA GCATTGCATC CCAAGCTGCG GACTGGGAGC 2820
GTGTCTCCTC CCTAAAGGAC GTGCGTGCGT GCGTGCGTGC GTGCGTGCGT GCGTGCGTCA 2880
CCAAGGTGTG TGGAGCTCAG GATCGCAGCC ATACACGCAA CTCCAGATGA TACCACTACC 2940
GCCAGTGTTT ACACAGAGGT TTCTGCCTGG CAAGCTTGGT ATTTTACAGT AGGTGAAGAT 3000
CATTCTGCAG AAGGGTGCTG GCACAGTGGA GCAGCACGGA TGTCCCCAGC CCCCGTTCTG 3060
GAAGACCCTA CAGCTGTGAG AGGCCCAGGG TTGAGCCAGA TGAAAGAAAA GCTGCGTGGG 3120
TGTGGGCTGT ACCCGGAAAA GGGCAGGTGG CAGGAGGTTT GCCTTGGCCT GTGCTTGGGC 3180
CGAGAACCAC ACTAAGGAGC AGCAGCCTGA GTTAGGAATC TATCTGGATT ACGGGGATCA 3240
GAGTTCCTGG AGAGTGGACT CAGTTTCTGC TCTGATCCAG GCCTGTTGTG CTTTTTTTTT 3300
TTCCCCCTTA AAAAAAAAAA AGTACAGACA GAATCTCAGC GGCTTCTAGA CTGATCTGAT 3360
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\- X \J\-\3\j\3\^t\j\j 


GAGGGGGGGh 


GGGATAGCt-A 


CATATCTGTG 


3420 






x oiivjK3L.v. x v_ L. 


»pr«m» r»rv"»Tv r» 
AGb x rkGuCAL 


IV IV IV P^PTPTP 

AAAGGCTGTG 


GAACxCAGCC 


J 4oU 








LAI AUALLLt, 


LT TCCCCCAG 


AGCGAAGGGC 




rrt » r*r*r*r' TV TTT 




v-rlv_i 1 AcxVj i I V. 


x 1 bu xGAAGG 


AGAACAGGGA 


CGTTGGCAGA 


■j r/vn 


AGCAGTTTGC 


At* I uuLLL 1 /i 


f * TV rri/ w i <cn TV Tv A 

V-»V-/i Xv, 1 


AC L u rLil GTC 


TGTCACACCA 


GAAGGTTCTA 


-jrcn 
iooU 


GACCTACCAC 


CACTTUCU x 1 


r*r-r*r* a 'Tv , T , r ,, ?v 

L.L.UV-A 1 L I LA 


TGGAAACCTT 


TTAGCCCATT 


CTGACCCCTG 


3720 


TGTG I GCTCT 


GAGd UAOA 1 




* r^r^cc^fT* jv 
vjACuuCCCAG 


GCACATCAGT 


CAGGGAGGCT 


3780 


CTGATGTGAG CCGCAGACCT CTGTGTTCAT TCCTATGAGC TGGAGGGGCT GGACTGGGTG 3840
GGGTCAGATG TGCTTGGCAG GAACTGTCAG CTGCTGAGCA GGGTGGTCCC TGAGCGGAGG 3900
ATAAGCAGCA TCAGACTCCA CAACCAGAGG AAGAAAGAAA TGGGGATGGA GCGGAGACCC 3960
ACGGGCTGAG TCCCGCTGTG GAGTGGCCTT GCAGCTCCCT CTCAGTTAAA ACTCCCAGTA 4020
AAGCCACAGT TCTCCGAGCA CCCAAGTCTG CTCCAGCCGT CTCTTAAAAC AGGCCACTCT 4080
CTGAGAAGGA ATTC 4094



(2) INFORMATION FOR SEQ ID NO: 6: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 873 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asp Arg Ala Ala Leu Arg Ala Ala Ala Met Gly Glu Lys Lys Glu
1 5 10 15

Gly Gly Gly Gly Gly Ala Ala Ala Asp Gly Gly Ala Gly Ala Ala Val
20 25 30

Ser Arg Ala Leu Gln Gln Cys Gly Gln Leu Gln Lys Leu Ile Asp Ile
35 40 45

Ser Ile Gly Ser Leu Arg Gly Leu Arg Thr Lys Cys Ser Val Ser Asn
50 55 60

Asp Leu Thr Gln Gln Glu Ile Arg Thr Leu Glu Ala Lys Leu Val Lys
65 70 75 80

Tyr Ile Cys Lys Gln Gln Gln Ser Lys Leu Ser Val Thr Pro Ser Asp
85 90 95

Arg Thr Ala Glu Leu Asn Ser Tyr Pro Arg Phe Ser Asp Trp Leu Tyr
100 105 110

Ile Phe Asn Val Arg Pro Glu Val Val Gln Glu Ile Pro Gln Glu Leu
115 120 125

Thr Leu Asp Ala Leu Leu Glu Met Asp Glu Ala Lys Ala Lys Glu Met
130 135 140

Leu Arg Arg Trp Gly Ala Ser Thr Glu Glu Cys Ser Arg Leu Gln Gln
145 150 155 160

Ala Leu Thr Cys Leu Arg Lys Val Thr Gly Leu Gly Gly Glu His Lys
165 170 175

Met Asp Ser Gly Trp Ser Ser Thr Asp Ala Arg Asp Ser Ser Leu Gly
180 185 190

Pro Pro Met Asp Met Leu Ser Ser Leu Gly Arg Ala Gly Ala Ser Thr
195 200 205

Gln Gly Pro Arg Ser Ile Ser Val Ser Ala Leu Pro Ala Ser Asp Ser
210 215 220

Pro Val Pro Gly Leu Ser Glu Gly Leu Ser Asp Ser Cys Ile Pro Leu
225 230 235 240

His Thr Ser Gly Arg Leu Thr Pro Arg Ala Leu His Ser Phe Ile Thr
245 250 255

Pro Pro Thr Thr Pro Gln Leu Arg Arg His Ala Lys Leu Lys Pro Pro
260 265 270

Arg Thr Pro Pro Pro Pro Ser Arg Lys Val Phe Gln Leu Leu Pro Ser
275 280 285

Phe Pro Thr Leu Thr Arg Ser Lys Ser His Glu Ser Gln Leu Gly Asn
290 295 300

Arg Ile Asp Asp Val Thr Pro Met Lys Phe Glu Leu Pro His Gly Ser
305 310 315 320

Pro Gln Leu Val Arg Arg Asp Ile Gly Leu Ser Val Thr His Arg Phe
325 330 335

Ser Thr Lys Ser Trp Leu Ser Gln Val Cys Asn Val Cys Gln Lys Ser
340 345 350

Met Ile Phe Gly Val Lys Cys Lys His Cys Arg Leu Lys Cys His Asn
355 360 365

Lys Cys Thr Lys Glu Ala Pro Ala Cys Arg Ile Thr Phe Leu Pro Leu
370 375 380

Ala Arg Leu Arg Arg Thr Glu Ser Val Pro Ser Asp Ile Asn Asn Pro
385 390 395 400

Val Asp Arg Ala Ala Glu Pro His Phe Gly Thr Leu Pro Lys Ala Leu
405 410 415

Thr Lys Lys Glu His Pro Pro Ala Met Asn Leu Asp Ser Ser Ser Asn
420 425 430

Pro Ser Ser Thr Thr Ser Ser Thr Pro Ser Ser Pro Ala Pro Phe Leu
435 440 445

Thr Ser Ser Asn Pro Ser Ser Ala Thr Thr Pro Pro Asn Pro Ser Pro
450 455 460

Gly Gln Arg Asp Ser Arg Phe Ser Phe Pro Asp Ile Ser Ala Cys Ser
465 470 475 480

Gln Ala Ala Pro Leu Ser Ser Thr Ala Asp Ser Thr Arg Leu Asp Asp
485 490 495

Gln Pro Lys Thr Asp Val Leu Gly Val His Glu Ala Glu Ala Glu Glu
500 505 510

Pro Glu Ala Gly Lys Ser Glu Ala Glu Asp Asp Glu Glu Asp Glu Val
515 520 525

Asp Asp Leu Pro Ser Ser Arg Arg Pro Trp Arg Gly Pro Ile Ser Arg
530 535 540

Lys Ala Ser Gln Thr Ser Val Tyr Leu Gln Glu Trp Asp Ile Pro Phe
545 550 555 560

Glu Gln Val Glu Leu Gly Glu Pro Ile Gly Gln Gly Arg Trp Gly Arg
565 570 575

Val His Arg Gly Arg Trp His Gly Glu Val Ala Ile Arg Leu Leu Glu
580 585 590

Met Asp Gly His Asn Gln Asp His Leu Lys Leu Phe Lys Lys Glu Val
595 600 605

Met Asn Tyr Arg Gln Thr Arg His Glu Asn Val Val Leu Phe Met Gly
610 615 620

Ala Cys Met Asn Pro Pro His Leu Ala Ile Ile Thr Ser Phe Cys Lys
625 630 635 640

Gly Arg Thr Leu His Ser Phe Val Arg Asp Pro Lys Thr Ser Leu Asp
645 650 655

Ile Asn Lys Thr Arg Gln Ile Ala Gln Glu Ile Ile Lys Gly Met Gly
660 665 670

Tyr Leu His Ala Lys Gly Ile Val His Lys Asp Leu Lys Ser Lys Asn
675 680 685

Val Phe Tyr Asp Asn Gly Lys Val Val Ile Thr Asp Phe Gly Leu Phe
690 695 700

Gly Ile Ser Gly Val Val Arg Glu Glu Arg Arg Glu Asn Gln Leu Lys
705 710 715 720

Leu Ser His Asp Trp Leu Cys Tyr Leu Ala Pro Glu Ile Val Arg Glu
725 730 735

Met Ile Pro Gly Arg Asp Glu Asp Gln Leu Pro Phe Ser Lys Ala Ala
740 745 750

Asp Val Tyr Ala Phe Gly Thr Val Trp Tyr Glu Leu Gln Ala Arg Asp
755 760 765

Trp Pro Phe Lys His Gln Pro Ala Glu Ala Leu Ile Trp Gln Ile Gly
770 775 780

Ser Gly Glu Gly Val Arg Arg Val Leu Ala Ser Val Ser Leu Gly Lys
785 790 795 800

Glu Val Gly Glu Ile Leu Ser Ala Cys Trp Ala Phe Asp Leu Gln Glu
805 810 815

Arg Pro Ser Phe Ser Leu Leu Met Asp Met Leu Glu Arg Leu Pro Lys
820 825 830

Leu Asn Arg Arg Leu Ser His Pro Gly His Phe Trp Lys Ser Ala Asp
835 840 845

Ile Asn Ser Ser Lys Val Met Pro Arg Phe Glu Arg Phe Gly Leu Gly
850 855 860

Thr Leu Glu Ser Gly Asn Pro Lys Met
865 870

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2846 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:





AGAGCAGCGC TGCGCTCGGC CGCGTTGGGA GAGAAGAAGG AGGGCGGTGG CGGGGGTGAC 60
GCGGCTATCG CGGAGGGAGG TGCAGGGGCC GCGGCCAGCC GGACACTGCA GCAGTGCGGG 120
CAGCTGCAGA AGCTCATCGA CATCTCCATC GGCAGCCTGC GCGGGCTGCG CACCAAGTGC 180
GTGGTGTCCA ACGACCTCAC CCAGCAGGAG ATACGGACCC TGGAGGCGAA GCTGGTCCGT 240
TACATTTGTA AGCAGAGGCA GTGCAAGCTG AGCGTGGCTC CCGGTGAGAG GACCCCAGAG 300
CTCAACAGCT ACCCCCGCTT CAGCGACTGG CTGTACACTT TCAACGTGAG GCCGGAGGTG 360
GTGCAGGAGA TCCCCCGAGA CCTCACGCTG GATGCCCTGC TGGAGATGAA TGAGGCCAAG 420
GTGAAGGAGA CGCTGCGGCG CTGTGGGGCC AGCGGGGATG AGTGTGGCCG TCTGCAGTAT 480
GCCCTCACCT GCCTGCGGAA GGTGACAGGC CTGGGAGGGG AGCACAAGGA GGACTCCAGT 540
TGGAGTTCAT TGGATGCGCG GCGGGAAAGT GGCTCAGGGC CTTCCACGGA CACCCTCTCA 600
GCAGCCAGCC TGCCCTGGCC CCCAGGGAGC TCCCAGCTGG GCAGAGCAGG CAACAGCGCC 660




CAGGGCCCAC 


V7w X WWXVA W X W 


CGTGTCAGCT 


CTTCCCGCCT 


CAGACTCCCC 


CACCCCCAGC 


720 


15 


TTCAGTGAGG 


v)V\. A WflK3/\ 


fACCTGTATT 


CCCCTGCACG 


CCAGCGGCCG 


GCTGACCCCC 


780 






A wrl A 


PACCCCGCCC 


ACCACACCCC 


AGCTGCGACG 


GCACACCAAG 


840 






v.^wvJvxrkwvjl_l- 




AGCCGCAAGG 


TCTTCCAGCT 


GCTGCCCAGC 


900 








rAAGTCCCAT 


GAGTCTCAGC 


TGGGGAACCG 

A wwwWinv W w 


CATTGATGAC 


960 






X unuu 1 X iun 


aa» x w a vuvn a 


GGATCCCCAC 


AGATGGTACG 


GAGGGATATC 


1020 




wwV3\» A W A V— .VJV3 






AAGTCCTGGC 


TGTCGCAGGT 

A \J A vvWiwW A 


CTGCCACGTG 

W; A WWWIUVJ A \J 


1080 




TGCCAGAAGA 




TTX*AGTGAAG 


TGCAAGCATT 


GCAGGTTGAA 


GTGTCACAAC 


1140 




AAATGTACCA 

****** a vj a nuuA 


AAflAAG^ , ^T ,, ^ , 


TGCCTGTAGA 

A Uvw x vj x nun 


ATATCCTTCC 

"A*** WW A X WW 


TGCCACTAAC 


TCGGCTTCGG 

A WUVw A X V«\JV7 


1200 




AGGACAGAAT 


V— X UAwWvU IV 


GGACATCAAC 


AACCCGGTGG 


ACAGAGCAGC 


CGAACCCCAT 


1260 




TTTGGAACCP 


TPPPfAAARr 1 


ACTGACAAAG 


AAGGAGCACC 


CTCCGGCCAT 

\»» A w wllWWI A 


GAATCATCTG 


1320 


25 


GACTCCAGCA 






TCCTCCACAC 

X WW X vWIIMMi 


cctcctcacc 


wVaWVJWWW X X w 


1380 




vuuaua x w\i 


wwAVrlV-wv~ rv AV 


UlUMAf wxlw w 


AW UV w W WULA 


ACCCCTCACC 

XIWWWW A vfU> W 


TGGCCAf5CGG 

X VJU\.WtvJv\JU 


1440 




GACAGCAGGT 


TCAACTTCCC 




'i""i*f*An u I^ "JVPC 

a x wti xwnxw 


ATAGACAGCA 


fyTTTATf^TTT* 

VJ* X Xf&XW X X A 


1500 




CCAGACATTT 


^ , AR^*^*T*^ , w^^ , 

^^ivv w a x a v_» v_- 


ACACGCAGCC 


CCGCTCCCTG 

w V— ww, A www A VJ 


AAGCTGCCGA 


CGGTACCCGG 

WWW A nVVVWV 


1560 




CTCGATGACC AGCCGAAAGC AGATGTGTTG GAAGCTCACG AAGCGGAGGC TGAGGAGCCA 1620
GAGGCTGGCA AGTCAGAGGC AGAAGACGAT GAGGACGAGG TGGACGACTT GCCGAGCTCT 1680
CGCCGGCCCT GGCGGGGCCC CATCTCTCGC AAGGCCAGCC AGACCAGCGT GTACCTGCAG 1740
GAGTGGGACA TCCCCTTCGA GCAGGTAGAG CTGGGCGAGC CCATCGGGCA GGGCCGCTGG 1800
GGCCGGGTGC ACCGCGGCCG CTGGCATGGC GAGGTGGCCA TTCGCCTGCT GGAGATGGAC 1860
GGCCACAACC AGGACCACCT GAAGCTCTTC AAGAAAGAGG TGATGAACTA CCGGCAGACG 1920
CGGCATGAGA ACGTGGTGCT CTTCATGGGG GCCTGCATGA ACCCGCCCCA CCTGGCCATT 1980
ATCACCAGCT TCTGCAAGGG GCGGACGTTG CACTCGTTTG TGAGGGACCC CAAGACGTCT 2040
CTGGACATCA ACAAGACGAG GCAAATCGCT CAGGAGATCA TCAAGGGCAT GGGATATCTT 2100
CATGCCAAGG GCATCGTACA CAAAGATCTC AAATCTAAGA ACGTCTTCTA TGACAACGGC 2160
AAGGTGGTCA TCACAGACTT CGGGCTGTTT GGGATCTCAG GCGTGGTCCG AGAGGGACGG 2220
CGTGAGAACC AGCTAAAGCT GTCCCACGAC TGGCTGTGCT ATCTGGCCCC TGAGATTGTA 2280
CGCGAGATGA CCCCCGGGAA GGACGAGGAT CAGCTGCCAT TCTCCAAAGC TGCTGATGTC 2340
TATGCATTTG GGACTGTTTG GTATGAGCTG CAAGCAAGAG ACTGGCCCTT GAAGAACCAG 2400
GCTGCAGAGG CATCCATCTG GCAGATTGGA AGCGGGGAAG GAATGAAGCG TGTCCTGACT 2460
TCTGTCAGCT TGGGGAAGGA AGTCAGTGAG ATCCTGTCGG CCTGCTGGGC TTTCGACCTG 2520
CAGGAGAGAC CCAGCTTCAG CCTGCTGATG GACATGCTGG AGAAACTTCC CAAGCTGAAC 2580
CGGCGGCTCT CCCACCCTGG ACACTTCTGG AAGTCAGCTG AGTTGTAGGC CTGGCTGCCT 2640
TGCATGCACC AGGGGCTTTC TTCCTCCTAA TCAACAACTC AGCACCGTGA CTTCTGCTAA 2700
AATGCAAAAT GAGATGCGGG CACTAACCCA GGGGATGCCA CCTCTGCTGC TCCAGTCGTC 2760
TCTCTCGAGG CTACTTCTTT TGCTTTGTTT TAAAAACTGG CCCTCTGCCC TCTCCACGTG 2820
GCCTGCATAT GCCCAAGCCG GAATTC 2846

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 875 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Arg Ala Ala Leu Arg Ser Ala Ala Leu Gly Glu Lys Lys Glu Gly Gly
1 5 10 15

Gly Gly Gly Asp Ala Ala Ile Ala Glu Gly Gly Ala Gly Ala Ala Ala
20 25 30

Ser Arg Thr Leu Gln Gln Cys Gly Gln Leu Gln Lys Leu Ile Asp Ile
35 40 45

Ser Ile Gly Ser Leu Arg Gly Leu Arg Thr Lys Cys Val Val Ser Asn
50 55 60

Asp Leu Thr Gln Gln Glu Ile Arg Thr Leu Glu Ala Lys Leu Val Arg
65 70 75 80

Tyr Ile Cys Lys Gln Arg Gln Cys Lys Leu Ser Val Ala Pro Gly Glu
85 90 95

Arg Thr Pro Glu Leu Asn Ser Tyr Pro Arg Phe Ser Asp Trp Leu Tyr
100 105 110

Thr Phe Asn Val Arg Pro Glu Val Val Gln Glu Ile Pro Arg Asp Leu
115 120 125

Thr Leu Asp Ala Leu Leu Glu Met Asn Glu Ala Lys Val Lys Glu Thr
130 135 140

Leu Arg Arg Cys Gly Ala Ser Gly Asp Glu Cys Gly Arg Leu Gln Tyr
145 150 155 160

Ala Leu Thr Cys Leu Arg Lys Val Thr Gly Leu Gly Gly Glu His Lys
165 170 175

Glu Asp Ser Ser Trp Ser Ser Leu Asp Ala Arg Arg Glu Ser Gly Ser
180 185 190

Gly Pro Ser Thr Asp Thr Leu Ser Ala Ala Ser Leu Pro Trp Pro Pro
195 200 205

Gly Ser Ser Gln Leu Gly Arg Ala Gly Asn Ser Ala Gln Gly Pro Arg
210 215 220

Ser Ile Ser Val Ser Ala Leu Pro Ala Ser Asp Ser Pro Thr Pro Ser
225 230 235 240

Phe Ser Glu Gly Leu Ser Asp Thr Cys Ile Pro Leu His Ala Ser Gly
245 250 255

Arg Leu Thr Pro Arg Ala Leu His Ser Phe Ile Thr Pro Pro Thr Thr
260 265 270

Pro Gln Leu Arg Arg His Thr Lys Leu Lys Pro Pro Arg Thr Pro Pro
275 280 285

Pro Pro Ser Arg Lys Val Phe Gln Leu Leu Pro Ser Phe Pro Thr Leu
290 295 300

Thr Arg Ser Lys Ser His Glu Ser Gln Leu Gly Asn Arg Ile Asp Asp
305 310 315 320

Val Ser Ser Met Arg Phe Asp Leu Ser His Gly Ser Pro Gln Met Val
325 330 335

Arg Arg Asp Ile Gly Leu Ser Val Thr His Arg Phe Ser Thr Lys Ser
340 345 350

Trp Leu Ser Gln Val Cys His Val Cys Gln Lys Ser Met Ile Phe Gly
355 360 365

Val Lys Cys Lys His Cys Arg Leu Lys Cys His Asn Lys Cys Thr Lys
370 375 380

Glu Ala Pro Ala Cys Arg Ile Ser Phe Leu Pro Leu Thr Arg Leu Arg
385 390 395 400

Arg Thr Glu Ser Val Pro Ser Asp Ile Asn Asn Pro Val Asp Arg Ala
405 410 415

Ala Glu Pro His Phe Gly Thr Leu Pro Lys Ala Leu Thr Lys Lys Glu
420 425 430

His Pro Pro Ala Met Asn His Leu Asp Ser Ser Ser Asn Pro Ser Ser
435 440 445

Thr Thr Ser Ser Thr Pro Ser Ser Pro Ala Pro Phe Pro Thr Ser Ser
450 455 460

Asn Pro Ser Ser Ala Thr Thr Pro Pro Asn Pro Ser Pro Gly Gln Arg
465 470 475 480

Asp Ser Arg Phe Asn Phe Pro Ala Ala Tyr Phe Ile His His Arg Gln
485 490 495

Gln Phe Ile Phe Pro Asp Ile Ser Ala Phe Ala His Ala Ala Pro Leu
500 505 510

Pro Glu Ala Ala Asp Gly Thr Arg Leu Asp Asp Gln Pro Lys Ala Asp
515 520 525

Val Leu Glu Ala His Glu Ala Glu Ala Glu Glu Pro Glu Ala Gly Lys
530 535 540

Ser Glu Ala Glu Asp Asp Glu Asp Glu Val Asp Asp Leu Pro Ser Ser
545 550 555 560

Arg Arg Pro Trp Arg Gly Pro Ile Ser Arg Lys Ala Ser Gln Thr Ser
565 570 575

Val Tyr Leu Gln Glu Trp Asp Ile Pro Phe Glu Gln Val Glu Leu Gly
580 585 590

Glu Pro Ile Gly Gln Gly Arg Trp Gly Arg Val His Arg Gly Arg Trp
595 600 605

His Gly Glu Val Ala Ile Arg Leu Leu Glu Met Asp Gly His Asn Gln
610 615 620

Asp His Leu Lys Leu Phe Lys Lys Glu Val Met Asn Tyr Arg Gln Thr
625 630 635 640

Arg His Glu Asn Val Val Leu Phe Met Gly Ala Cys Met Asn Pro Pro
645 650 655

His Leu Ala Ile Ile Thr Ser Phe Cys Lys Gly Arg Thr Leu His Ser
660 665 670

Phe Val Arg Asp Pro Lys Thr Ser Leu Asp Ile Asn Lys Thr Arg Gln
675 680 685

Ile Ala Gln Glu Ile Ile Lys Gly Met Gly Tyr Leu His Ala Lys Gly
690 695 700

Ile Val His Lys Asp Leu Lys Ser Lys Asn Val Phe Tyr Asp Asn Gly
705 710 715 720

Lys Val Val Ile Thr Asp Phe Gly Leu Phe Gly Ile Ser Gly Val Val
725 730 735

Arg Glu Gly Arg Arg Glu Asn Gln Leu Lys Leu Ser His Asp Trp Leu
740 745 750

Cys Tyr Leu Ala Pro Glu Ile Val Arg Glu Met Thr Pro Gly Lys Asp
755 760 765

Glu Asp Gln Leu Pro Phe Ser Lys Ala Ala Asp Val Tyr Ala Phe Gly
770 775 780

Thr Val Trp Tyr Glu Leu Gln Ala Arg Asp Trp Pro Leu Lys Asn Gln
785 790 795 800

Ala Ala Glu Ala Ser Ile Trp Gln Ile Gly Ser Gly Glu Gly Met Lys
805 810 815

Arg Val Leu Thr Ser Val Ser Leu Gly Lys Glu Val Ser Glu Ile Leu
820 825 830

Ser Ala Cys Trp Ala Phe Asp Leu Gln Glu Arg Pro Ser Phe Ser Leu
835 840 845

Leu Met Asp Met Leu Glu Lys Leu Pro Lys Leu Asn Arg Arg Leu Ser
850 855 860

His Pro Gly His Phe Trp Lys Ser Ala Glu Leu
865 870 875

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2126 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAATTCCGGC ACACATCAGC ACTCACACAG CACACAGCAC ACACACAGCA CACATCAGCG 60
CACACACAGC ACAGCTTCAT CACCCCGCCC ACCACACCCC AGCTGCGACG GCACACCAAG 120
CTGAAGCCAC CACGGACGCC CCCCCCACCC AGCCGCAAGG TCTTCCAGCT GCTGCCCAGC 180
TTCCCCACAC TCACCCGGAG CAAGTCCCAT GAGTCTCAGC TGGGGAACCG CATTGATGAC 240
GTCTCCTCGA TGAGGTTTGA TCTCTCGCAT GGATCCCCAC AGATGGTACG GAGGGATATC 300
GGGCTGTCGG TGACGCACAG GTTCTCCACC AAGTCCTGGC TGTCGCAGGT CTGCCACGTG 360
TGCCAGAAGA GCATGATATT TGGAGTGAAG TGCAAGCATT GCAGGTTGAA GTGTCACAAC 420
AAATGTACCA AAGAAGCCCC TGCCTGTAGA ATATCCTTCC TGCCACTAAC TCGGCTTCGG 480
AGGACAGAAT CTGTCCCCTC GGACATCAAC AACCCGGTGG ACAGAGCAGC CGAACCCCAT 540
TTTGGAACCC TCCCCAAAGC ACTGACAAAG AAGGAGCACC CTCCGGCCAT GAATCACCTG 600
GACTCCAGCA GCAACCCTTC CTCCACCACC TCCTCCACAC CCTCCTCACC GGCGCCCTTC 660
CCGACATCAT CCAACCCATC CAGCGCCACC ACGCCCCCCA ACCCCTCACC TGGCCAGCGG 720
GACAGCAGGT TCAACTTCCC AGCTGCCTAC TTCATTCATC ATAGACAGCA GTTTATCTTT 780
CCAGACATTT CAGCCTTTGC ACACGCAGCC CCGCTCCCTG AAGCTGCCGA CGGTACCCGG 840
CTCGATGACC AGCCGAAAGC AGATGTGTTG GAAGCTCACG AAGCGGAGGC TGAGGAGCCA 900
GAGGCTGGCA AGTCAGAGGC AGAAGACGAT GAGGACGAGG TGGACGACTT GCCGAGCTCT 960
CGCCGGCCCT GGCGGGGCCC CATCTCTCGC AAGGCCAGCC AGACCAGCGT GTACCTGCAG 1020
GAGTGGGACA TCCCCTTCGA GCAGGTAGAG CTGGGCGAGC CCATCGGGCA GGGCCGCTGG 1080
GGCCGGGTGC ACCGCGGCCG CTGGCATGGC GAGGTGGCCA TTCGCCTGCT GGAGATGGAC 1140
GGCCACAACC AGGACCACCT GAAGCTCTTC AAGAAAGAGG TGATGAACTA CCGGCAGACG 1200
CGGCATGAGA ACGTGGTGCT CTTCATGGGG GCCTGCATGA ACCCGCCCCA CCTGGCCATT 1260
ATCACCAGCT TCTGCAAGGG GCGGACGTTG CACTCGTTTG TGAGGGACCC CAAGACGTCT 1320
CTGGACATCA ACAAGACGAG GCAAATCGCT CAGGAGATCA TCAAGGGCAT GGGATATCTT 1380
CATGCCAAGG GCATCGTACA CAAAGATCTC AAATCTAAGA ACGTCTTCTA TGACAACGGC 1440
AAGGTGGTCA TCACAGACTT CGGGCTGTTT GGGATCTCAG GCGTGGTCCG AGAGGGACGG 1500
CGTGAGAACC AGCTAAAGCT GTCCCACGAC TGGCTGTGCT ATCTGGCCCC TGAGATTGTA 1560
CGCGAGATGA CCCCCGGGAA GGACGAGGAT CAGCTGCCAT TCTCCAAAGC TGCTGATGTC 1620
TATGCATTTG GGACTGTTTG GTATGAGCTG CAAGCAAGAG ACTGGCCCTT GAAGAACCAG 1680
GCTGCAGAGG CATCCATCTG GCAGATTGGA AGCGGGGAAG GAATGAAGCG TGTCCTGACT 1740
TCTGTCAGCT TGGGGAAGGA AGTCAGTGAG ATCCTGTCGG CCTGCTGGGC TTTCGACCTG 1800
CAGGAGAGAC CCAGCTTCAG CCTGCTGATG GACATGCTGG AGAAACTTCC CAAGCTGAAC 1860
CGGCGGCTCT CCCACCCTGG ACACTTCTGG AAGTCAGCTG AGTTGTAGGC CTGGCTGCCT 1920
TCCATGCACC AGGGGCTTTC TTCCTCCTAA TCAACAACTC AGCACCGTGA CTTCTGCTAA 1980
AATGCAAAAT GAGATGCGGG CACTAACCCA GGGGATGCCA CCTCTGCTGC TCCAGTCGTC 2040
TCTCTCGAGG CTACTTCTTT TGCTTTGTTT TAAAAACTGG CCCTCTGCCC TCTCCACGTG 2100
GCCTGCATAT GCCCAAGCCG GAATTC 2126



(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 635 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Glu Phe Arg His Thr Ser Ala Leu Thr Gln His Thr Ala His Thr Gln
1 5 10 15

His Thr Ser Ala His Thr Gln His Ser Phe Ile Thr Pro Pro Thr Thr
20 25 30

Pro Gln Leu Arg Arg His Thr Lys Leu Lys Pro Pro Arg Thr Pro Pro
35 40 45

Pro Pro Ser Arg Lys Val Phe Gln Leu Leu Pro Ser Phe Pro Thr Leu
50 55 60

Thr Arg Ser Lys Ser His Glu Ser Gln Leu Gly Asn Arg Ile Asp Asp
65 70 75 80

Val Ser Ser Met Arg Phe Asp Leu Ser His Gly Ser Pro Gln Met Val
85 90 95

Arg Arg Asp Ile Gly Leu Ser Val Thr His Arg Phe Ser Thr Lys Ser
100 105 110

Trp Leu Ser Gln Val Cys His Val Cys Gln Lys Ser Met Ile Phe Gly
115 120 125

Val Lys Cys Lys His Cys Arg Leu Lys Cys His Asn Lys Cys Thr Lys
130 135 140

Glu Ala Pro Ala Cys Arg Ile Ser Phe Leu Pro Leu Thr Arg Leu Arg
145 150 155 160

Arg Thr Glu Ser Val Pro Ser Asp Ile Asn Asn Pro Val Asp Arg Ala
165 170 175

Ala Glu Pro His Phe Gly Thr Leu Pro Lys Ala Leu Thr Lys Lys Glu
180 185 190

His Pro Pro Ala Met Asn His Leu Asp Ser Ser Ser Asn Pro Ser Ser
195 200 205

Thr Thr Ser Ser Thr Pro Ser Ser Pro Ala Pro Phe Pro Thr Ser Ser
210 215 220

Asn Pro Ser Ser Ala Thr Thr Pro Pro Asn Pro Ser Pro Gly Gln Arg
225 230 235 240

Asp Ser Arg Phe Asn Phe Pro Ala Ala Tyr Phe Ile His His Arg Gln
245 250 255

Gln Phe Ile Phe Pro Asp Ile Ser Ala Phe Ala His Ala Ala Pro Leu
260 265 270

Pro Glu Ala Ala Asp Gly Thr Arg Leu Asp Asp Gln Pro Lys Ala Asp
275 280 285

Val Leu Glu Ala His Glu Ala Glu Ala Glu Glu Pro Glu Ala Gly Lys
290 295 300

Ser Glu Ala Glu Asp Asp Glu Asp Glu Val Asp Asp Leu Pro Ser Ser
305 310 315 320

Arg Arg Pro Trp Arg Gly Pro Ile Ser Arg Lys Ala Ser Gln Thr Ser
325 330 335

Val Tyr Leu Gln Glu Trp Asp Ile Pro Phe Glu Gln Val Glu Leu Gly
340 345 350

Glu Pro Ile Gly Gln Gly Arg Trp Gly Arg Val His Arg Gly Arg Trp
355 360 365

His Gly Glu Val Ala Ile Arg Leu Leu Glu Met Asp Gly His Asn Gln
370 375 380

Asp His Leu Lys Leu Phe Lys Lys Glu Val Met Asn Tyr Arg Gln Thr
385 390 395 400

Arg His Glu Asn Val Val Leu Phe Met Gly Ala Cys Met Asn Pro Pro
405 410 415

His Leu Ala Ile Ile Thr Ser Phe Cys Lys Gly Arg Thr Leu His Ser
420 425 430

Phe Val Arg Asp Pro Lys Thr Ser Leu Asp Ile Asn Lys Thr Arg Gln
435 440 445

Ile Ala Gln Glu Ile Ile Lys Gly Met Gly Tyr Leu His Ala Lys Gly
450 455 460

Ile Val His Lys Asp Leu Lys Ser Lys Asn Val Phe Tyr Asp Asn Gly
465 470 475 480

Lys Val Val Ile Thr Asp Phe Gly Leu Phe Gly Ile Ser Gly Val Val
485 490 495

Arg Glu Gly Arg Arg Glu Asn Gln Leu Lys Leu Ser His Asp Trp Leu
500 505 510

Cys Tyr Leu Ala Pro Glu Ile Val Arg Glu Met Thr Pro Gly Lys Asp
515 520 525

Glu Asp Gln Leu Pro Phe Ser Lys Ala Ala Asp Val Tyr Ala Phe Gly
530 535 540

Thr Val Trp Tyr Glu Leu Gln Ala Arg Asp Trp Pro Leu Lys Asn Gln
545 550 555 560

Ala Ala Glu Ala Ser Ile Trp Gln Ile Gly Ser Gly Glu Gly Met Lys
565 570 575

Arg Val Leu Thr Ser Val Ser Leu Gly Lys Glu Val Ser Glu Ile Leu
580 585 590

Ser Ala Cys Trp Ala Phe Asp Leu Gln Glu Arg Pro Ser Phe Ser Leu
595 600 605

Leu Met Asp Met Leu Glu Lys Leu Pro Lys Leu Asn Arg Arg Leu Ser
610 615 620

His Pro Gly His Phe Trp Lys Ser Ala Glu Leu
625 630 635

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 326 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Asp Ala Lys Ser Ser Glu Glu Asn Trp Asn Ile Leu Ala Glu Glu Ile
1 5 10 15

Leu Ile Gly Pro Arg Ile Gly Ser Gly Ser Phe Gly Thr Val Tyr Arg
20 25 30

Ala His Trp His Gly Pro Val Pro Val Lys Thr Leu Asn Val Lys Thr
35 40 45

Pro Ser Pro Ala Gln Leu Gln Ala Phe Lys Asn Glu Val Ala Met Leu
50 55 60

Lys Lys Thr Arg His Cys Asn Ile Leu Ile Phe Met Gly Cys Val Ser
65 70 75 80

Lys Pro Ser Leu Ala Ile Val Thr Gln Trp Cys Glu Gly Ser Ser Leu
85 90 95

Tyr Lys His Val His Val Ser Glu Thr Lys Phe Lys Leu Asn Thr Leu
100 105 110

Ile Asp Ile Gly Arg Gln Val Ala Gln Gln Met Asp Tyr Leu His Ala
115 120 125

Lys Asn Ile Ile His Arg Asp Leu Lys Ser Asn Asn Ile Phe Leu His
130 135 140

Glu Asp Leu Ser Val Lys Ile Gly Asp Phe Gly Leu Ala Thr Ala Lys
145 150 155 160

Thr Arg Trp Ser Gly Glu Lys Gln Ala Asn Gln Pro Thr Gly Ser Ile
165 170 175

Leu Trp Met Ala Pro Glu Val Ile Arg Met Gln Glu Leu Asn Pro Tyr
180 185 190

Ser Phe Gln Ser Asp Val Tyr Ala Phe Gly Ile Val Met Tyr Glu Leu
195 200 205

Leu Ala Glu Cys Leu Pro Tyr Gly His Ile Ser Asn Lys Asp Gln Ile
210 215 220

Leu Phe Met Val Gly Arg Gly Leu Leu Arg Pro Asp Met Ser Gln Val
225 230 235 240

Arg Ser Asp Ala Arg Arg His Ser Lys Arg Ile Ala Glu Asp Cys Ile
245 250 255

Lys Tyr Thr Pro Lys Asp Arg Pro Leu Phe Arg Pro Leu Leu Trp Met
260 265 270

Leu Glu Asn Met Leu Arg Thr Leu Pro Lys Ile His Arg Ser Ala Ser
275 280 285

Glu Pro Asn Leu Thr Gln Ser Gln Leu Gln Asn Asp Glu Phe Leu Tyr
290 295 300

Leu Pro Ser Pro Lys Thr Pro Val Asn Phe Asn Asn Phe Gln Phe Phe
305 310 315 320

Gly Ser Ala Gly Asn Ile
325



(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 315 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Gly Gln Arg Asp Ser Ser Tyr Tyr Trp Glu Ile Glu Ala Ser Glu Val
1 5 10 15

Met Leu Ser Thr Arg Ile Gly Ser Gly Ser Phe Gly Thr Val Tyr Lys
20 25 30

Cys Lys Trp His Gly Asp Val Ala Val Lys Ile Leu Lys Val Val Asp
35 40 45

Pro Thr Pro Glu Gln Phe Gln Ala Phe Arg Asn Glu Val Ala Val Leu
50 55 60

Arg Lys Thr Arg His Val Asn Ile Leu Leu Phe Met Gly Tyr Met Thr
65 70 75 80

Lys Asp Asn Leu Ala Ile Val Thr Gln Trp Cys Glu Gly Ser Ser Leu
85 90 95

Tyr Lys His Leu His Val Gln Glu Thr Lys Phe Gln Met Phe Gln Leu
100 105 110

Ile Asp Ile Ala Arg Gln Thr Ala Gln Gly Met Asp Tyr Leu His Ala
115 120 125

Lys Asn Ile Ile His Arg Asp Met Lys Ser Asn Asn Ile Phe Leu His
130 135 140

Glu Gly Leu Thr Val Lys Ile Gly Asp Phe Gly Leu Ala Thr Val Lys
145 150 155 160

Ser Arg Trp Ser Gly Ser Gln Gln Val Glu Gln Pro Thr Gly Ser Val
165 170 175

Leu Trp Met Ala Pro Glu Val Ile Arg Met Gln Asp Asn Asn Pro Phe
180 185 190

Ser Phe Gln Ser Asp Val Tyr Ser Tyr Gly Ile Val Leu Tyr Glu Leu
195 200 205

Met Thr Gly Glu Leu Pro Tyr Ser His Ile Asn Asn Arg Asp Gln Ile
210 215 220

Ile Phe Met Val Gly Arg Gly Tyr Ala Ser Pro Asp Leu Ser Lys Leu
225 230 235 240

Tyr Lys Asn Cys Pro Lys Ala Met Lys Arg Leu Val Ala Asp Cys Val
245 250 255

Lys Lys Val Lys Glu Glu Arg Pro Leu Phe Pro Gln Ile Leu Ser Ser
260 265 270

Ile Glu Leu Leu Gln His Ser Leu Pro Lys Ile Asn Arg Ser Ala Ser
275 280 285

Glu Pro Ser Leu His Arg Ala Ala His Thr Glu Asp Ile Asn Ala Cys
290 295 300

Thr Leu Thr Thr Ser Pro Arg Leu Pro Val Phe
305 310 315






WO 97/21820 



PCT/US96/19941 



WHAT IS CLAIMED IS:

1. An isolated kinase suppressor of ras (Ksr) protein. 

2. An isolated kinase suppressor of ras (Ksr) protein according to claim 1, wherein said protein is 
mammalian. 

3. An isolated kinase suppressor of ras (Ksr) protein according to claim 1, wherein said protein is
human.

4. An isolated nucleic acid encoding a kinase suppressor of ras (Ksr) according to claim 1. 

5. An isolated nucleic acid encoding a kinase suppressor of ras (Ksr) according to claim 1, said nucleic
acid capable of hybridizing with SEQUENCE ID NO: 1, 3, 5, or 7 under low stringency conditions.

6. An isolated nucleic acid having the sequence defined by or complementary or reverse
complementary to SEQUENCE ID NO: 1, 3, 5 or 7, or a fragment thereof capable of hybridizing with a
nucleic acid having the sequence defined by SEQUENCE ID NO: 1, 3, 5 or 7 under low stringency
conditions.

7. A nucleic acid according to claim 5, wherein said low stringency conditions
are defined by a hybridization buffer consisting essentially of 1% Bovine
Serum Albumin (BSA); 500 mM sodium phosphate (NaPO4); 1 mM EDTA; 7% SDS at a
temperature of 42°C and a wash buffer consisting essentially of 2X SSC (600 mM
NaCl; 60 mM Na Citrate); 0.1% SDS at 50°C.

8. A nucleic acid according to claim 5, wherein said low stringency conditions
are defined by a hybridization buffer consisting essentially of 1% Bovine
Serum Albumin (BSA); 500 mM sodium phosphate (NaPO4); 15% formamide; 1 mM
EDTA; 7% SDS at a temperature of 50°C and a wash buffer consisting essentially
of 1X SSC (300 mM NaCl; 30 mM Na Citrate); 0.1% SDS at 50°C.

9. A nucleic acid according to claim 5, wherein said low stringency conditions
are defined by a hybridization buffer consisting essentially of 1% Bovine
Serum Albumin (BSA); 200 mM sodium phosphate (NaPO4); 15% formamide; 1 mM EDTA;
7% SDS at a temperature of 50°C and a wash buffer consisting essentially of
0.5X SSC (150 mM NaCl; 15 mM Na Citrate); 0.1% SDS at 65°C.

10. A nucleic acid according to claim 6, wherein said low stringency conditions
are defined by a hybridization buffer consisting essentially of 1% Bovine
Serum Albumin (BSA); 500 mM sodium phosphate (NaPO4); 1 mM EDTA; 7% SDS at a
temperature of 42°C and a wash buffer consisting essentially of 2X SSC (600 mM
NaCl; 60 mM Na Citrate); 0.1% SDS at 50°C.

11. A nucleic acid according to claim 6, wherein said low stringency conditions
are defined by a hybridization buffer consisting essentially of 1% Bovine
Serum Albumin (BSA); 500 mM sodium phosphate (NaPO4); 15% formamide; 1 mM
EDTA; 7% SDS at a temperature of 50°C and a wash buffer consisting essentially
of 1X SSC (300 mM NaCl; 30 mM Na Citrate); 0.1% SDS at 50°C.


12. A nucleic acid according to claim 6, wherein said low stringency conditions
are defined by a hybridization buffer consisting essentially of 1% Bovine
Serum Albumin (BSA); 200 mM sodium phosphate (NaPO4); 15% formamide; 1 mM EDTA;
7% SDS at a temperature of 50°C and a wash buffer consisting essentially of
0.5X SSC (150 mM NaCl; 15 mM Na Citrate); 0.1% SDS at 65°C.

13. A method of identifying lead compounds for a pharmacological agent useful in the diagnosis or 
treatment of disease, said method comprising the steps of: 

forming a mixture comprising:
a Ksr according to claim 1,
a natural intracellular Ksr binding target, wherein said binding target is capable of
specifically binding said Ksr, and
a candidate pharmacological agent;
incubating said mixture under conditions whereby, but for the presence of said candidate
pharmacological agent, said Ksr selectively binds said binding target at a first binding affinity;
detecting a second binding affinity of said Ksr to said binding target,
wherein a difference between said first and second binding affinity indicates that said candidate
pharmacological agent is a lead compound for a pharmacological agent capable of modulating
Ksr-dependent signal transduction.


14. A method according to claim 13, wherein said Ksr binding target comprises a 14-3-3 gene product.

15. A method according to claim 14, wherein said Ksr binding target comprises a Ksr protein. 

16. A method of identifying lead compounds for a pharmacological agent useful in the diagnosis or
treatment of disease, said method comprising the steps of:

forming a mixture comprising:
a Ksr according to claim 1,
a substrate, wherein Ksr is capable of specifically phosphorylating said substrate, and
a candidate pharmacological agent;
incubating said mixture under conditions whereby, but for the presence of said candidate
pharmacological agent, said Ksr selectively phosphorylates said substrate at a first rate;
detecting a second rate of phosphorylation of said substrate by said Ksr,
wherein a difference between said first and second rate indicates that said candidate
pharmacological agent is a lead compound for a pharmacological agent capable of modulating Ksr kinase
activity.

17. A method according to claim 16 wherein said Ksr substrate comprises a 14-3-3 gene product.

18. A method according to claim 16 wherein said Ksr substrate comprises a Ksr protein. 

19. A vector comprising a nucleic acid according to claim 5 operably linked to a transcription regulatory
region not naturally linked to a Ksr-encoding gene.

20. A host cell comprising a vector according to claim 19. 

21. A method of making a Ksr protein, said method comprising incubating a cell according to claim 20.

22. A recombinant isolated Ksr protein produced by a cell according to claim 20. 

23. A recombinant isolated Ksr protein according to claim 22, wherein said cell is a mammalian cell, 
an avian cell, an insect cell, a fungal cell, an amphibian cell or a fish cell. 
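
The comparison step recited in the screening methods of claims 13 and 16 (a first binding affinity or phosphorylation rate measured without the candidate agent, a second measured with it) can be illustrated with a short sketch. It is not part of the claimed subject matter. The class name, the function name, the compound labels, and the two-fold threshold are all hypothetical assumptions introduced for illustration; the claims themselves require only "a difference" between the two measurements.

```python
from dataclasses import dataclass


@dataclass
class AssayResult:
    """One screening measurement pair (all field names are hypothetical)."""
    compound: str        # label of the candidate pharmacological agent
    first_value: float   # rate or affinity measured without the candidate agent
    second_value: float  # rate or affinity measured with the candidate agent


def is_lead(result: AssayResult, min_fold_change: float = 2.0) -> bool:
    """Return True when the two measurements differ by at least min_fold_change.

    The fold-change threshold is an illustrative assumption, not a value
    taken from the claims or the specification.
    """
    if result.first_value <= 0:
        raise ValueError("first_value must be positive")
    if result.second_value < 0:
        raise ValueError("second_value must not be negative")
    ratio = result.second_value / result.first_value
    # Either a sufficient increase or a sufficient decrease counts as a difference.
    return ratio >= min_fold_change or ratio <= 1.0 / min_fold_change


if __name__ == "__main__":
    screen = [
        AssayResult("cmpd-A", first_value=10.0, second_value=2.0),
        AssayResult("cmpd-B", first_value=10.0, second_value=9.0),
        AssayResult("cmpd-C", first_value=10.0, second_value=25.0),
    ]
    for r in screen:
        print(r.compound, "lead" if is_lead(r) else "no effect at this threshold")
```

In practice the threshold would be set from the assay's measured variability rather than fixed in advance; the sketch only shows that both inhibitors and enhancers satisfy the "difference" criterion.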
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